SMILES import problems (molconvert has none)

User 677b9c22ff

15-11-2006 20:03:43

Hi,


Importing SMILES into Instant-JChem gives an SQL error.


O=C1C2=C(C=CC3=C2OCO3)C(C5(H)N(C)CCC4=CC(OCO6)=C6C=C45)(H)O1


O=C(OC)C(C(O)CC5)C(C5(H)C4)(H)CC1(H)N4CCC2=C1NC3=C2C=CC=C3


N(H)(H)H





However, molconvert has no issues and takes the compounds without problems. Wouldn't it be good to forward such an error to molconvert and let it try?





Another issue arises when migrating files to Instant-JChem: it causes a lot of hassle if the structure is simply excluded (shown in the import field) while other fields contain names or values. In such a case the whole DB structure is destroyed, and the import error is not recorded anywhere in the localdb window. (The "allow empty structures" feature does not help here.)





So compared to the old DB, which had, say, 3,000,000 molecules, the newly imported DB has only 2,999,997 molecules. If I try to merge fields that were calculated with I-JChem into my old DB, I will have a lot of trouble.





Tobias

ChemAxon fa971619eb

16-11-2006 09:29:39

Thanks for the report. The problem just seems to be with those 3 SMILES strings. Most SMILES seem to import fine. We are investigating this.





When a row fails to import it is listed so that you can investigate this further.


What other action would you suggest?

ChemAxon 9c0afc9aaf

16-11-2006 15:11:05

Hi,





These are faulty SMILES strings.


The hydrogen atoms should appear inside brackets, for example the correct form of





N(H)(H)H





is





N([H])([H])[H].
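As an illustration of the bracket rule above, here is a minimal Python sketch that wraps every `H` found outside square brackets in `[ ]`. The function name `bracket_bare_hydrogens` is made up for this example and is not part of any ChemAxon tool. It only repairs this one kind of fault and does not otherwise validate the string.

```python
def bracket_bare_hydrogens(smiles: str) -> str:
    """Wrap each 'H' that appears outside square brackets in [ ].

    Outside brackets, SMILES only allows the "organic subset" atoms
    (B, C, N, O, P, S, F, Cl, Br, I and their aromatic forms), so a
    bare 'H' there is always the faulty form discussed above.
    """
    out = []
    in_bracket = False
    for ch in smiles:
        if ch == "[":
            in_bracket = True
        elif ch == "]":
            in_bracket = False
        if ch == "H" and not in_bracket:
            out.append("[H]")  # bare hydrogen: add the missing brackets
        else:
            out.append(ch)     # everything else is copied unchanged
    return "".join(out)


print(bracket_bare_hydrogens("N(H)(H)H"))  # N([H])([H])[H]
print(bracket_bare_hydrogens("[NH3]"))     # [NH3] (already bracketed, untouched)
```

Hydrogens that are already written inside a bracket atom, such as `[NH3]`, are left alone because the scan tracks whether it is currently inside `[...]`.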





For further information please see the Daylight documentation:


http://www.daylight.com/dayhtml_tutorials/languages/smiles/index.html


Quote: Hydrogen is NOT part of the "organic subset" and therefore needs brackets.


We do not recognize the faulty strings as SMILES, but if the format is known (e.g. from the file extension), the less strict import code accepts them.


This explains why it causes problems only in specific parts of our code.





Logically we should never accept these SMILES, but we do not think it is a major bug that we are able to import them under certain circumstances.





Best regards,





Szilard

User 677b9c22ff

17-11-2006 06:43:11

Hi,


Yes, I agree, strict rules are OK. But this is the problem with SMILES: they were never error-free or unique. My only concern is that if I have a large DB, or I want to import a large list with attached values in columns which may be more important than the structure itself, they are just discarded. But maybe it's good to let people clean their data first.
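A rough sketch of such a pre-import cleaning step, in Python: it splits rows into those that look importable and those with a bare `H` outside brackets, keeping the row number and the attached fields so that nothing is silently lost. The function names and the sample names in the second column are invented for illustration. The check covers only the bare-hydrogen fault from this thread, not SMILES validity in general.

```python
def has_bare_hydrogen(smiles: str) -> bool:
    """Return True if an 'H' occurs outside square brackets."""
    in_bracket = False
    for ch in smiles:
        if ch == "[":
            in_bracket = True
        elif ch == "]":
            in_bracket = False
        elif ch == "H" and not in_bracket:
            return True
    return False


def partition_rows(rows):
    """Split (smiles, name) rows into accepted rows and flagged rows.

    Flagged rows keep their 1-based row number and all fields, so the
    counts can be reconciled against the original file afterwards.
    """
    accepted, flagged = [], []
    for row_no, (smiles, name) in enumerate(rows, start=1):
        if has_bare_hydrogen(smiles):
            flagged.append((row_no, smiles, name))
        else:
            accepted.append((smiles, name))
    return accepted, flagged


# Hypothetical sample data; the names are made up for this example.
rows = [("CCO", "ethanol"), ("N(H)(H)H", "ammonia"), ("[NH4+]", "ammonium")]
accepted, flagged = partition_rows(rows)
print(len(accepted), "accepted;", len(flagged), "flagged")
for row_no, smiles, name in flagged:
    print(f"row {row_no}: {smiles!r} ({name}) needs bracketed hydrogens")
```

The point of keeping the flagged rows together with their row numbers is that accepted plus flagged always adds up to the original row count.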


Tobias