User a11e9761d6
10-11-2008 21:28:23
We have a SMILES string:
Cn/1cccc\c1=N\c2cccc[n+]2C
that we insert into our database via JChem Base. Once inserted, the cd_smiles and cd_structure columns are as follows:
cd_smiles: Cn1-cccc\c1=N\c1cccc[n+]1C
cd_structure: Cn/1cccc\c1=N\c2cccc[n+]2C
When this second SMILES (the cd_smiles) is imported, it resolves to a different unique structure, even after dearomatizing and removing explicit Hs.
Neither of these appears to be a correct SMILES for the structure. Correct SMILES would be:
N(c1[n+](cccc1)C)=C2\C=C/C=C\N2C
or
Cn1ccccc1=Nc1cccc[n+]1C
(ChemSpider, for example, is able to find the correct structure when a search for the original SMILES is run).
We suspect it is a structure cleaning issue from the original SMILES because there is a forward slash after the first 'n'. It seems that JChem should either clean this SMILES correctly or simply reject it as an invalid SMILES string. Is this a known issue? Any recommendations for how to handle this?
Thanks,
Krishna Dole
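One possible stopgap while waiting for an answer is to flag such strings before they are inserted. The following is a minimal sketch in plain Python (standard library only, no JChem calls) that reports where a directional bond symbol, '/' or '\', sits directly in front of a ring-closure label, as in the 'n/1' of the original SMILES. The function name directional_ring_closures is hypothetical, and the check is purely textual: it does not decide whether the SMILES is valid, it only marks records for manual review.

```python
import re
from typing import List

# Matches a directional bond symbol ('/' or '\') placed directly before a
# ring-closure label: a single digit, or '%' followed by two digits.
# Inside bracket atoms such as [13C] no bond symbol can occur, so a digit
# right after '/' or '\' can only be a ring-closure label.
_DIRECTIONAL_RING_CLOSURE = re.compile(r"[/\\](?=\d|%\d\d)")


def directional_ring_closures(smiles: str) -> List[int]:
    """Return the string offsets of '/' or '\\' symbols that precede a ring-closure label.

    Hypothetical helper for pre-screening only; an empty list means no such
    pattern was found, not that the SMILES is correct.
    """
    return [m.start() for m in _DIRECTIONAL_RING_CLOSURE.finditer(smiles)]


if __name__ == "__main__":
    original = "Cn/1cccc\\c1=N\\c2cccc[n+]2C"
    expected = "Cn1ccccc1=Nc1cccc[n+]1C"
    for smi in (original, expected):
        hits = directional_ring_closures(smi)
        status = "review" if hits else "ok"
        print(f"{status}\t{smi}\t{hits}")
```

Run against the strings from this post, the original SMILES is flagged at offset 2 (the slash after the first 'n'), while both SMILES we consider correct pass without a hit.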