Different search results with SMILES and standard InChI

User 779e37e0e6

12-11-2015 02:53:10

Hi,


I am running a superstructure search and, while debugging, I have realized that using SMILES and standard InChI for search operations sometimes returns different results, especially when stereochemistry is considered.


For instance, I have the following SMARTS patterns for L- and D-alpha-amino acids:


L-: [$([H][C@@;X4]([#6;!$(C(=O)O)])([#7;A;X3,X4+;!$([N]~[!#6]);!$([N]*!@[#7,#8,#15,#16])])[#6;X3]([#8;A;X2H1,X1-])=[O;X1]),$([#6][C@@;X4]([#7;A;X3,X4+;!$([N]~[!#6]);!$([N]*!@[#7,#8,#15,#16])])([#6][!#1!#6])[#6;X3]([#8;A;X2H1,X1-])=[O;X1]),$([#6]-[#6][C@;X4]([#7;A;X3,X4+;!$([N]~[!#6]);!$([N]*!@[#7,#8,#15,#16])])([#6]-[#1,!#6])[#6;X3]([#8;A;X2H1,X1-])=[O;X1])]


D-: [$([H][C@;X4]([#6;!$(C(=O)O)])([#7;A;X3,X4+;!$([N]~[!#6]);!$([N]*!@[#7,#8,#15,#16])])[#6;X3]([#8;A;X2H1,X1-])=[O;X1]),$([#6][C@;X4]([#7;A;X3,X4+;!$([N]~[!#6]);!$([N]*!@[#7,#8,#15,#16])])([#6][!#1!#6])[#6;X3]([#8;A;X2H1,X1-])=[O;X1]),$([#6]-[#6][C@@;X4]([#7;A;X3,X4+;!$([N]~[!#6]);!$([N]*!@[#7,#8,#15,#16])])([#6]-[#1,!#6])[#6;X3]([#8;A;X2H1,X1-])=[O;X1])]


The standard InChIs are generated using the function MolExporter.exportToFormat(molecule, "inchi:SAbs"), where molecule is an instance of the class Molecule.


Here are the results for L-Proline and D-Glutamine:


[H][C@]1(CCCN1)C(O)=O (L-proline)


- using SMILES: l-alpha-amino-acid_2


- using InChI: d-alpha-amino-acid_2



[H][C@@](N)(CCC(O)=N)C(O)=O (D-glutamine)


- using SMILES: d-alpha-amino-acid_2


- using InChI: l-alpha-amino-acid_2


Could you please help me out here? What could be the cause?


Regards,


MrYan

ChemAxon abe887c64e

16-11-2015 10:57:33

Dear MrYan,


We could reproduce the opposite hits when the query structure was supplied in InChI format. After a quick investigation, our suggestion is to call the function with an additional option:


MolExporter.exportToFormat(molecule, "inchi:SAbs,AuxNone")

Here the AuxNone option is used as well. Our guess is that the error happens when the auxiliary information is written out into the InChI format.
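To confirm whether auxiliary information is actually present in the text being passed to the search, a small check like the following may help. It is only a sketch: InchiExportText is a hypothetical helper, not part of JChem, and it assumes the exported text puts the identifier on a line starting with "InChI=" and any auxiliary data on a line starting with "AuxInfo=". The sample text is a placeholder (methane with a dummy auxiliary line), not real export output.

```java
import java.util.Arrays;
import java.util.Optional;

// Hypothetical helper, not part of JChem: inspects the text returned by an InChI export.
final class InchiExportText {

    // Returns the first line that starts with "InChI=", if there is one.
    static Optional<String> identifierOnly(String exported) {
        return Arrays.stream(exported.split("\\R"))
                .map(String::trim)
                .filter(line -> line.startsWith("InChI="))
                .findFirst();
    }

    // True if any line of the exported text starts with "AuxInfo=".
    static boolean hasAuxInfo(String exported) {
        return Arrays.stream(exported.split("\\R"))
                .map(String::trim)
                .anyMatch(line -> line.startsWith("AuxInfo="));
    }

    public static void main(String[] args) {
        // Placeholder input: methane's standard InChI plus a dummy AuxInfo line.
        String exported = "InChI=1S/CH4/h1H4\nAuxInfo=placeholder";
        System.out.println("has AuxInfo: " + hasAuxInfo(exported));
        System.out.println(identifierOnly(exported).orElse("(no InChI line found)"));
    }
}
```

In practice the string returned by MolExporter.exportToFormat would be passed in instead of the placeholder, once with "inchi:SAbs" and once with "inchi:SAbs,AuxNone", to see whether the two exports differ only in the auxiliary line.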


Would you check this suggestion?


Krisztina

User 779e37e0e6

17-11-2015 00:50:01

Thanks Krisztina,


I have tried it out, and it works for L-proline but not for D-glutamine. To my surprise, converting D-glutamine ([H][C@@](N)(CCC(O)=N)C(O)=O) to InChI and using that to perform the structure search still returns l-alpha-amino-acid_2.


Regards,


MrYan

ChemAxon abe887c64e

17-11-2015 09:43:16

Hi MrYan,


It is really strange: I cannot reproduce the opposite hit with D-glutamine. Exporting D-glutamine ([H][C@@](N)(CCC(O)=N)C(O)=O) to InChI without the auxiliary data preserves the (R) configuration, and the resulting InChI hits only the D-alpha-amino-acid pattern.
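One way to compare what each side is seeing is to look at the stereo layers of the exported InChI strings directly. The sketch below is an assumption-laden helper, not a JChem API: InchiStereoLayers is a hypothetical class that splits an InChI on "/" and keeps the first /b, /t, /m and /s layers it finds (it does not handle separate isotopic or fixed-H stereo layers). The two sample strings are what I understand to be the standard InChIs of L- and D-proline; please substitute your own exports.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper, not part of JChem: pulls the stereo layers out of an InChI string.
final class InchiStereoLayers {

    // Maps layer prefix ("b", "t", "m", "s") to its content, first occurrence only.
    static Map<String, String> stereoLayers(String inchi) {
        Map<String, String> layers = new LinkedHashMap<>();
        String[] parts = inchi.trim().split("/");
        // parts[0] is the version prefix and parts[1] the formula, so start at index 2.
        for (int i = 2; i < parts.length; i++) {
            String part = parts[i];
            if (part.isEmpty()) {
                continue;
            }
            String prefix = part.substring(0, 1);
            if (prefix.equals("b") || prefix.equals("t")
                    || prefix.equals("m") || prefix.equals("s")) {
                layers.putIfAbsent(prefix, part.substring(1));
            }
        }
        return layers;
    }

    // True if both strings carry identical stereo layers.
    static boolean sameStereo(String first, String second) {
        return stereoLayers(first).equals(stereoLayers(second));
    }

    public static void main(String[] args) {
        // Sample inputs (assumed standard InChIs of L- and D-proline; replace with your exports).
        String lProline = "InChI=1S/C5H9NO2/c7-5(8)4-2-1-3-6-4/h4,6H,1-3H2,(H,7,8)/t4-/m0/s1";
        String dProline = "InChI=1S/C5H9NO2/c7-5(8)4-2-1-3-6-4/h4,6H,1-3H2,(H,7,8)/t4-/m1/s1";
        System.out.println(stereoLayers(lProline));
        System.out.println(stereoLayers(dProline));
        System.out.println("same stereo: " + sameStereo(lProline, dProline));
    }
}
```

If the InChI generated on your side and the one generated here differ in the /m layer for the same input structure, that would point to the export step rather than to the SMARTS patterns.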


What is the version number of your JChem?


Krisztina