Tautomers fail under jc_compare using t:ff option, tdf:y idx

User 7f33ec9a5c

10-12-2012 21:27:49

Hi,


We tested a large chemical catalog with over a million structures and found a few cases (145) where a SMILES could be matched to another SMILES in the database using jcf.standardize(<smiles>, 'config:tautomerize outFormat:smiles:u'), but the index-backed jc_compare failed to match the same pair using 't:ff tautomerSearch:y charge:i radical:i stereoSearchType:i'.


The attached file lists the 145 SMILES pairs as they existed in the catalog and the database. The columns in the file are as follows:


fwd = the result of jcf.compare(smi_1, smi_2, 't:ff tautomerSearch:y charge:i radical:i stereoSearchType:i')


rev = the result of jcf.compare(smi_2, smi_1, 't:ff tautomerSearch:y charge:i radical:i stereoSearchType:i')


smi_1 = first smiles


smi_2 = second smiles


tautomeric form = the standardized tautomer, which is the SAME for both SMILES, i.e. jcf.standardize(smi_1, 'config:tautomerize outFormat:smiles:u') = jcf.standardize(smi_2, 'config:tautomerize outFormat:smiles:u')
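
For anyone who wants to tally a file laid out like this, here is a minimal Python sketch. It assumes the file is tab-separated with the five columns in the order listed above, and that fwd and rev are recorded as 1 (match) or 0 (no match); neither detail is stated in this thread. The function name summarize_pairs and the two sample rows are made up for illustration and are not taken from the attached file.

```python
import csv
import io


def summarize_pairs(text, delimiter="\t"):
    """Count compare outcomes from rows of: fwd, rev, smi_1, smi_2, tautomeric form.

    Assumes fwd/rev are '1' for a match and '0' otherwise (an assumption,
    not something confirmed by the thread).
    """
    counts = {"both": 0, "fwd_only": 0, "rev_only": 0, "neither": 0}
    for row in csv.reader(io.StringIO(text), delimiter=delimiter):
        # Skip header lines, blank lines, and anything without 0/1 flags.
        if len(row) < 4:
            continue
        fwd_flag, rev_flag = row[0].strip(), row[1].strip()
        if fwd_flag not in ("0", "1") or rev_flag not in ("0", "1"):
            continue
        fwd, rev = fwd_flag == "1", rev_flag == "1"
        if fwd and rev:
            counts["both"] += 1
        elif fwd:
            counts["fwd_only"] += 1
        elif rev:
            counts["rev_only"] += 1
        else:
            counts["neither"] += 1
    return counts


# Illustrative placeholder rows only; these are NOT from the attached file.
SAMPLE = (
    "fwd\trev\tsmi_1\tsmi_2\ttautomeric form\n"
    "0\t0\tCC(=O)C\tCC(O)=C\tCC(C)=O\n"
    "1\t0\tOc1ccccn1\tO=c1cccc[nH]1\tO=c1cccc[nH]1\n"
)

if __name__ == "__main__":
    print(summarize_pairs(SAMPLE))
```

Splitting the counts by direction makes it easy to see whether a pair fails in both directions or only when one particular SMILES is used as the query.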


 


Any suggestions about why the t:ff search failed when both smiles resolve to the same tautomer would be appreciated.

ChemAxon abe887c64e

11-12-2012 12:25:43

Thank you for sending us your observations. We will investigate your data soon.


Best regards,


Krisztina

ChemAxon abe887c64e

03-01-2013 11:09:49

We are now working on the next version of JChem (5.12), to be released next month, which contains improvements to tautomer search. When the same queries are run with the current state of this version, 112 of the 145 compound pairs you sent are hits of each other.


The search results (t:ff tautomerSearch:y charge:i radical:i stereoSearchType:i) are attached:


query: smi_1.smi target: smi_2.smi result: result_12.smi


query: smi_2.smi target: smi_1.smi result: result_21.smi


We are continuing to investigate the remaining 33 compound pairs.


Thank you for your patience.


Krisztina

ChemAxon d4fff15f08

03-06-2013 11:47:56

We have investigated the 33 compound pairs that were not hits in JChem 5.12. All 33 pairs are resonance structures (see the attached image), which are not meant to be hits with the given settings. To find them you may want to use other standardization procedures (we can help you with this).


Best regards,


Norbert