"Standardizer" can't produce only one identical SM

User 941c2467a3

02-04-2007 19:49:23

Hi,





I ran into a problem when passing a SMILES string to the "Standardizer": it does not converge to a single SMILES string, even for input that has already been standardized. (When I feed the output SMILES string back into the "Standardizer", I get the original input SMILES string again. The "Standardizer" just changes the cis/trans information in the SMILES string, then changes it back, and so on.)





The SMILES string is,


[O-]\[N+]([O-])=C1\C\C(C(=O)\C(C1)=[N+](/[O-])[O-])=[N+](/[O-])[O-]





at


http://umbbd.msi.umn.edu/servlets/pageservlet?ptype=c&compID=c0977





The standardization configuration file,





<?xml version="1.0" encoding="UTF-8"?>
<!-- Chemaxon Standardizer configuration file -->
<StandardizerConfiguration Version ="0.1">
    <Actions>
        <Action ID="aromatize" Act="aromatize"/>
        <Action ID="dehydrogenze" Act="dehydrogenize"/>
    </Actions>
</StandardizerConfiguration>








Thanks a lot!





Jeff

User 941c2467a3

02-04-2007 21:04:09

Additionally, the two SMILES strings I got from the "Standardizer" are,


[O-]\[N+]([O-])=C1\C\C(C(=O)\C(C1)=[N+](/[O-])[O-])=[N+](/[O-])[O-]


and


[O-]\[N+]([O-])=C1/C\C(C(=O)\C(C1)=[N+](\[O-])[O-])=[N+](\[O-])[O-]


When I paste them into MarvinSketch, they show the same structure.
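For what it's worth, a quick check suggests that the two strings above differ only in their bond-direction markers (`/` and `\`). The snippet below is just an illustrative sketch in plain Python, not part of Standardizer or Marvin; the helper name `strip_direction` is made up for this example.

```python
# The two strings reported above (raw strings so the backslashes stay literal).
first = r"[O-]\[N+]([O-])=C1\C\C(C(=O)\C(C1)=[N+](/[O-])[O-])=[N+](/[O-])[O-]"
second = r"[O-]\[N+]([O-])=C1/C\C(C(=O)\C(C1)=[N+](\[O-])[O-])=[N+](\[O-])[O-]"


def strip_direction(smiles: str) -> str:
    # Hypothetical helper: drop the bond-direction markers '/' and '\'.
    return smiles.replace("/", "").replace("\\", "")


# Positions at which the two strings disagree.
diff_positions = [i for i, (a, b) in enumerate(zip(first, second)) if a != b]

# Every disagreement should be a direction marker on both sides.
only_markers = all(first[i] in "/\\" and second[i] in "/\\" for i in diff_positions)

print(strip_direction(first) == strip_direction(second))  # True
print(len(diff_positions), only_markers)                  # 3 True
```

This only shows that the atoms and bonds are written in the same order in both strings. Whether the two sets of direction markers describe the same stereochemistry is a separate question that a string comparison cannot answer.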





Sorry for posting this message in the wrong forum :-(





Jeff

ChemAxon 9c0afc9aaf

02-04-2007 21:20:34

Hi,
Quote:
Sorry for posting this message in the wrong forum :-(
No worries, I have moved it here.





Could you let us know the exact JChem version you are using?


(this is very important for any bug report)





Regards,





Szilard

User 941c2467a3

02-04-2007 22:08:19

Thanks! We currently use JChem 3.2.3.





Jeff

ChemAxon a3d59b832c

03-04-2007 07:26:45

Hi Jeff,





Did you try the unique SMILES output (format string "smiles:u" instead of "smiles")? It seems to me that it does the trick.





Szabolcs

ChemAxon a3d59b832c

03-04-2007 07:38:04

Some explanation:





SMILES output is not unique by default: the generated string may depend on atom numbering and other features. There is, however, a slightly slower algorithm that generates canonical (unique) SMILES strings.





Please note that this is entirely an output issue. Standardizer does all canonicalization at the molecule structure level, according to the business rules in your configuration. It does not deal with atom numbering or with canonicalization of the output SMILES.
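To make the distinction concrete, here is a rough sketch of the round-trip test described at the top of the thread: feed the output back in and see whether it stays put. Everything in it is an assumption for illustration. `flip_flop_writer` and `toy_unique_writer` are toy stand-ins that only mimic the reported behaviour; they do not call Standardizer, and `min()` is in no way a real SMILES canonicalization algorithm.

```python
from typing import Callable

# The two strings reported earlier in the thread.
A = r"[O-]\[N+]([O-])=C1\C\C(C(=O)\C(C1)=[N+](/[O-])[O-])=[N+](/[O-])[O-]"
B = r"[O-]\[N+]([O-])=C1/C\C(C(=O)\C(C1)=[N+](\[O-])[O-])=[N+](\[O-])[O-]"


def is_stable(smiles: str, standardize: Callable[[str], str]) -> bool:
    """Return True if feeding the output back in reproduces the same output."""
    once = standardize(smiles)
    twice = standardize(once)
    return once == twice


def flip_flop_writer(smiles: str) -> str:
    # Toy stand-in for a non-unique writer: alternates between the two
    # equivalent spellings, as observed in the report.
    return B if smiles == A else A


def toy_unique_writer(smiles: str) -> str:
    # Toy stand-in for a unique writer: always returns the same spelling,
    # whichever equivalent string it was given. NOT a real canonicalizer.
    return min(A, B)


print(is_stable(A, flip_flop_writer))   # False
print(is_stable(A, toy_unique_writer))  # True
```

In practice one would pass a function that actually runs the molecule through Standardizer and writes it with the chosen format string; the check itself stays the same.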





I hope this helps,





Szabolcs