Marvin SMILES canonicalization vs Standardizer

User 83c8dbce58

30-01-2009 21:30:09

I've noticed that MarvinSketch appears to handle the canonicalization of SMILES differently than Standardizer does. We are using MarvinSketch v5.1.3_2 and Standardizer v1.5.

Below, please note the differences in how MarvinSketch and Standardizer handle the following SMILES. Each string was pasted into MarvinSketch and also entered into a text box that submits the SMILES string to Standardizer.

Original query: N1C=CC2=C1C=CC=C2

Marvin: c1ccc2[nH]ccc2c1

Standardizer: N1C=CC2=C1C=CC=C2

Original query: C1CCC2=C(C1)NC3=C2C=CC=C3

Marvin: C1CCc2c(C1)[nH]c1ccccc21

Standardizer: C1CCC2=C(C1)NC3=C2C=CC=C3

Any idea why Marvin apparently changes the SMILES string to a much greater degree than does Standardizer?

Thanks,

Nicko

ChemAxon 909aee4527

05-02-2009 10:42:53

Dear Nicko,

We are investigating this and will return soon with an answer.

Kind regards,

Judit

ChemAxon 25dcd765a3

05-02-2009 15:58:53

The unique form of the given SMILES is the one produced by Marvin. It is the aromatized form.

Standardizer standardizes the original molecules according to a set of predefined rules and does not aromatize them unless aromatization is on that list of rules. It therefore produces canonical SMILES rather than unique SMILES.
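
As a rough illustration of the difference between the two outputs in this thread, the Python sketch below checks whether a SMILES string is written in aromatic (lowercase atom) notation or in Kekule form. It is only a text-level heuristic using the standard library. It is not how Marvin or Standardizer work internally, it does not perceive aromaticity, and the function name `uses_aromatic_notation` is made up for this example.

```python
import re

# Match one atom at a time: either a bracket atom such as [nH],
# a two-letter halogen (Cl, Br), or a single-letter organic-subset atom.
# Illustrative pattern only; it does not validate the SMILES string.
_ATOM = re.compile(
    r"\[\d*(?P<bracket>[A-Za-z][a-z]?)[^\]]*\]"
    r"|(?P<plain>Cl|Br|[BCNOPSFIbcnops])"
)


def uses_aromatic_notation(smiles: str) -> bool:
    """Return True if any atom symbol is written in lowercase (aromatic) form."""
    for match in _ATOM.finditer(smiles):
        symbol = match.group("bracket") or match.group("plain")
        if symbol[0].islower():
            return True
    return False


if __name__ == "__main__":
    # The four output strings quoted earlier in this thread.
    examples = {
        "Marvin (indole)": "c1ccc2[nH]ccc2c1",
        "Standardizer (indole)": "N1C=CC2=C1C=CC=C2",
        "Marvin (second query)": "C1CCc2c(C1)[nH]c1ccccc21",
        "Standardizer (second query)": "C1CCC2=C(C1)NC3=C2C=CC=C3",
    }
    for label, smiles in examples.items():
        print(label, smiles, uses_aromatic_notation(smiles))
```

Run on the strings above, the check reports the Marvin outputs as aromatic notation and the Standardizer outputs as Kekule notation, which matches the explanation that Standardizer leaves the input non-aromatized unless an aromatization step is configured.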








All the best


Andras