User d83ec9d6e4
21-01-2005 17:40:27
The unique SMILES algorithm is very powerful, but there's one limitation that I don't quite understand: why can't labels (atom maps) be respected during canonicalization?
I notice that the unique SMILES for
[C:1]NC is [CH3:1]NC
while for CN[C:1] it is CN[CH3:1].
If you substitute a charge instead of an atom map it works fine:
[C+]NC -> CN[CH2+] and CN[C+] -> CN[CH2+]
I've hacked out a way to use other properties like charge to resolve these kinds of issues, where I just need a unique SMILES for a labeled molecule. But shouldn't it be straightforward to include these tags directly in the canonicalization algorithm (thus making it unique for all cxsmiles)?
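
To make it concrete, here is a toy sketch in plain Python of what I mean by including the tag in the canonicalization. It is not the toolkit's actual algorithm: it just brute-forces every atom ordering and keeps the smallest (atoms, bonds) key, which only works for tiny molecules. The function name canonical_key and the (element, charge, map_label) tuples are my own invention for illustration, and hydrogens are ignored.

```python
from itertools import permutations

def canonical_key(atoms, bonds, use_labels=True):
    """Brute-force canonical key for a tiny molecular graph (illustration only).

    atoms: list of (element, charge, map_label) tuples (hypothetical layout)
    bonds: list of (i, j) pairs of atom indices
    """
    best = None
    for order in permutations(range(len(atoms))):
        # order[k] is the original index of the atom placed at position k
        pos = {old: new for new, old in enumerate(order)}
        # Atom invariants: include the map label only when requested
        atom_part = tuple(
            atoms[old] if use_labels else atoms[old][:2] for old in order
        )
        # Bonds rewritten in the new numbering, in a fixed sorted order
        bond_part = tuple(
            sorted(tuple(sorted((pos[i], pos[j]))) for i, j in bonds)
        )
        key = (atom_part, bond_part)
        if best is None or key < best:
            best = key
    return best

# [C:1]NC and CN[C:1]: the same labeled molecule written in two atom orders
mol_a = ([("C", 0, 1), ("N", 0, 0), ("C", 0, 0)], [(0, 1), (1, 2)])
mol_b = ([("C", 0, 0), ("N", 0, 0), ("C", 0, 1)], [(0, 1), (1, 2)])
# A genuinely different molecule: the label sits on the nitrogen
mol_c = ([("C", 0, 0), ("N", 0, 1), ("C", 0, 0)], [(0, 1), (1, 2)])

same_labeled = canonical_key(*mol_a) == canonical_key(*mol_b)
different = canonical_key(*mol_a) != canonical_key(*mol_c)
print(same_labeled, different)
```

Because the label is part of each atom's invariant, the two input orders of the same labeled molecule collapse to one key, the same way the charge already does in the [C+] case. A real implementation would presumably feed the label into the initial atom ranks rather than enumerate permutations.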