User 918876d6ff
13-03-2013 17:23:24
I have run into a really strange problem.
I have two chemical compounds, in two different databases, that are supposed to be the same:
- kegg:C03516 http://www.genome.jp/dbget-bin/www_bget?C03516
- CHEBI:15431 http://www.ebi.ac.uk/chebi/searchId.do?chebiId=CHEBI:15431
If I load these two compounds with MolImporter from their respective molfiles ( http://www.genome.jp/dbget-bin/www_bget?-f+m+compound+C03516 for kegg:C03516 and http://www.ebi.ac.uk/chebi/saveStructure.do;jsessionid=4803C46E7EC8609A093ABB7548715337?defaultImage=true&chebiId=15431&imageId=0 for CHEBI:15431) and ask for the InChI and the SMILES of each, jchem returns the same InChI and the same SMILES for both, which is correct:
InChI returned for kegg:C03516
InChI=1S/C34H34N4O4.Mg/c1-7-21-17(3)25-13-26-19(5)23(9-11-33(39)40)31(37-26)16-32-24(10-12-34(41)42)20(6)28(38-32)15-30-22(8-2)18(4)27(36-30)14-29(21)35-25;/h7-8,13-16H,1-2,9-12H2,3-6H3,(H4,35,36,37,38,39,40,41,42);/q;+2/p-2/b25-13-,26-13-,27-14-,28-15-,29-14-,30-15-,31-16-,32-16-;
SMILES returned for kegg:C03516
CC1=C(CCC(O)=O)C2=N\C\1=C/C1=C(C)C(C=C)=C3\C=C4/N=C(/C=C5\N([Mg]N13)/C(=C\2)C(CCC(O)=O)=C5C)C(C=C)=C4C
InChI returned for chebi:15431
InChI=1S/C34H34N4O4.Mg/c1-7-21-17(3)25-13-26-19(5)23(9-11-33(39)40)31(37-26)16-32-24(10-12-34(41)42)20(6)28(38-32)15-30-22(8-2)18(4)27(36-30)14-29(21)35-25;/h7-8,13-16H,1-2,9-12H2,3-6H3,(H4,35,36,37,38,39,40,41,42);/q;+2/p-2/b25-13-,26-13-,27-14-,28-15-,29-14-,30-15-,31-16-,32-16-;
SMILES returned for chebi:15431
CC1=C(CCC(O)=O)C2=N\C\1=C/C1=C(C)C(C=C)=C3\C=C4/N=C(/C=C5\N([Mg]N13)/C(=C\2)C(CCC(O)=O)=C5C)C(C=C)=C4C
But if I now ask for the major microspecies at pH 7.3, I get different results depending on whether I used the kegg or the chebi molfile, which is strange:
InChI returned for major microspecies of kegg:C03516 at pH 7.3
InChI=1S/C34H34N4O4.Mg/c1-7-21-17(3)25-13-26-19(5)23(9-11-33(39)40)31(37-26)16-32-24(10-12-34(41)42)20(6)28(38-32)15-30-22(8-2)18(4)27(36-30)14-29(21)35-25;/h7-8,13-16H,1-2,9-12H2,3-6H3,(H4,35,36,37,38,39,40,41,42);/q;+2/p-4/b25-13-,26-13-,27-14-,28-15-,29-14-,30-15-,31-16-,32-16-;
SMILES returned for major microspecies of kegg:C03516 at pH 7.3
CC1=C(CCC([O-])=O)C2=N\C\1=C/C1=C(C)C(C=C)=C3\C=C4/N=C(/C=C5\N([Mg]N13)/C(=C\2)C(CCC([O-])=O)=C5C)C(C=C)=C4C
InChI returned for major microspecies of chebi:15431 at pH 7.3
InChI=1S/C34H34N4O4.Mg/c1-7-21-17(3)25-13-26-19(5)23(9-11-33(39)40)31(37-26)16-32-24(10-12-34(41)42)20(6)28(38-32)15-30-22(8-2)18(4)27(36-30)14-29(21)35-25;/h7-8,13-16,36-37H,1-2,9-12H2,3-6H3,(H,39,40)(H,41,42);/q-2;+2/p-2/b25-13-,26-13-,27-14-,28-15-,29-14-,30-15-,31-16-,32-16-;
SMILES returned for major microspecies of chebi:15431 at pH 7.3
CC1=C2NC(\C=C3/N4[Mg]N5\C(=C/2)C(C)=C(C=C)\C\5=C\C2=C(C)C(C=C)=C(N2)\C=C4\C(C)=C3CCC([O-])=O)=C1CCC([O-])=O
Is there any explanation for this?
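To narrow down where the two major-microspecies InChIs actually differ, here is a minimal sketch in plain Python. It makes no jchem calls, the helper names `inchi_layers` and `differing_layers` are my own, and it assumes a standard InChI in which each layer prefix occurs at most once. It splits each string on `/` and lists the layers whose content is not identical:

```python
# Part shared by both major-microspecies InChIs quoted above (formula + /c layer).
COMMON = (
    "InChI=1S/C34H34N4O4.Mg/"
    "c1-7-21-17(3)25-13-26-19(5)23(9-11-33(39)40)31(37-26)16-32-24(10-12-34(41)42)"
    "20(6)28(38-32)15-30-22(8-2)18(4)27(36-30)14-29(21)35-25;"
)
# The /b layer is also identical in both results.
STEREO = "/b25-13-,26-13-,27-14-,28-15-,29-14-,30-15-,31-16-,32-16-;"

KEGG_MS = (COMMON
           + "/h7-8,13-16H,1-2,9-12H2,3-6H3,(H4,35,36,37,38,39,40,41,42);/q;+2/p-4"
           + STEREO)
CHEBI_MS = (COMMON
            + "/h7-8,13-16,36-37H,1-2,9-12H2,3-6H3,(H,39,40)(H,41,42);/q-2;+2/p-2"
            + STEREO)


def inchi_layers(inchi):
    """Split an InChI string into {layer prefix: layer content}."""
    if not inchi.startswith("InChI="):
        raise ValueError("not an InChI string")
    parts = inchi[len("InChI="):].split("/")
    layers = {"version": parts[0]}
    if len(parts) > 1:
        layers["formula"] = parts[1]
    for part in parts[2:]:
        if part:
            # Assumption: one-letter prefixes, each used at most once.
            layers[part[0]] = part[1:]
    return layers


def differing_layers(a, b):
    """Return the sorted layer keys whose content differs between a and b."""
    la, lb = inchi_layers(a), inchi_layers(b)
    return sorted(k for k in set(la) | set(lb) if la.get(k) != lb.get(k))


for key in differing_layers(KEGG_MS, CHEBI_MS):
    print(key, "|", inchi_layers(KEGG_MS).get(key), "|", inchi_layers(CHEBI_MS).get(key))
```

For the two strings above, only the /h, /q and /p layers differ; the formula, the /c layer and the /b layer are identical. If I read the layers correctly, both results carry the same net charge of -2 (kegg: /q;+2 with /p-4, chebi: /q-2;+2 with /p-2), so the difference lies in where the hydrogens and charges are placed, not in the overall composition.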
Even stranger: if, instead of importing the kegg compound from the molfile, I import it from its InChI and ask jchem to give me back the InChI of this molecule, I get a different value:
InChI used for the import:
InChI=1S/C34H34N4O4.Mg/c1-7-21-17(3)25-13-26-19(5)23(9-11-33(39)40)31(37-26)16-32-24(10-12-34(41)42)20(6)28(38-32)15-30-22(8-2)18(4)27(36-30)14-29(21)35-25;/h7-8,13-16H,1-2,9-12H2,3-6H3,(H4,35,36,37,38,39,40,41,42);/q;+2/p-2/b25-13-,26-13-,27-14-,28-15-,29-14-,30-15-,31-16-,32-16-;
InChI returned by jchem:
InChI=1S/C34H34N4O4.Mg/c1-7-21-17(3)25-13-26-19(5)23(9-11-33(39)40)31(37-26)16-32-24(10-12-34(41)42)20(6)28(38-32)15-30-22(8-2)18(4)27(36-30)14-29(21)35-25;/h7-8,13-16H,1-2,9-12H2,3-6H3,(H4,35,36,37,38,39,40,41,42);/q;+2/p-2/b25-13-,26-13?,27-14?,28-15-,29-14-,30-15?,31-16-,32-16?;
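So why is some of the information encoded in the InChI used for the import (here, four of the eight double-bond stereo descriptors in the /b layer, which come back as "?") lost during the process?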
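To see exactly which stereo descriptors change, a small sketch in plain Python can compare the `/b` layers of the InChI used for the import and the one returned. It is independent of jchem; `double_bond_parities` and `changed_parities` are hypothetical helper names, and the regular expression assumes that each `/b` entry has the form `atom-atom` followed by one of `-`, `+`, `?` or `u`:

```python
import re

# Everything the imported and returned InChIs share, up to and including /p-2.
PREFIX = (
    "InChI=1S/C34H34N4O4.Mg/"
    "c1-7-21-17(3)25-13-26-19(5)23(9-11-33(39)40)31(37-26)16-32-24(10-12-34(41)42)"
    "20(6)28(38-32)15-30-22(8-2)18(4)27(36-30)14-29(21)35-25;"
    "/h7-8,13-16H,1-2,9-12H2,3-6H3,(H4,35,36,37,38,39,40,41,42);/q;+2/p-2"
)
IMPORTED = PREFIX + "/b25-13-,26-13-,27-14-,28-15-,29-14-,30-15-,31-16-,32-16-;"
RETURNED = PREFIX + "/b25-13-,26-13?,27-14?,28-15-,29-14-,30-15?,31-16-,32-16?;"


def double_bond_parities(inchi):
    """Return {(atom1, atom2): parity} parsed from the /b layer, or {} if absent."""
    for part in inchi.split("/")[2:]:
        if part.startswith("b"):
            entries = re.findall(r"(\d+)-(\d+)([-+?u])", part[1:])
            return {(int(a), int(b)): parity for a, b, parity in entries}
    return {}


def changed_parities(before, after):
    """Return {bond: (parity before, parity after)} for bonds whose parity differs."""
    pb, pa = double_bond_parities(before), double_bond_parities(after)
    return {
        bond: (pb.get(bond), pa.get(bond))
        for bond in sorted(set(pb) | set(pa))
        if pb.get(bond) != pa.get(bond)
    }


for bond, (old, new) in changed_parities(IMPORTED, RETURNED).items():
    print(bond, old, "->", new)
```

For the pair above, four of the eight entries (26-13, 27-14, 30-15 and 32-16) change from `-` to `?`, while the other four are unchanged.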
SMILES returned by jchem:
[Mg++].[H]\C1=C2\[N-]\C(=C([H])/C3=N/C(=C([H])\C4=N\C(=C([H])/C5=C(C=C)C(C)=C1N5)\C(C)=C4CCC(O)=O)/C(CCC([O-])=O)=C3C)C(C)=C2C=C
which differs from the one obtained when importing from the molfile.
And if I now ask for the major microspecies of this compound (the one imported from the InChI) at pH 7.3, the Mg is lost entirely:
InChI returned by jchem for the major microspecies at pH 7.3:
InChI=1S/C34H34N4O4/c1-7-21-17(3)25-13-26-19(5)23(9-11-33(39)40)31(37-26)16-32-24(10-12-34(41)42)20(6)28(38-32)15-30-22(8-2)18(4)27(36-30)14-29(21)35-25/h7-8,13-16,35-36H,1-2,9-12H2,3-6H3,(H,39,40)(H,41,42)/p-2
SMILES returned by jchem for the major microspecies at pH 7.3:
CC1=C(CCC([O-])=O)C2=NC1=CC1=C(C=C)C(C)=C(N1)C=C1NC(=CC3=NC(=C2)C(CCC([O-])=O)=C3C)C(C)=C1C=C
The same kind of inconsistency occurs if I import from the SMILES.
SMILES used for the import:
CC1=C(CCC(O)=O)C2=N\C\1=C/C1=C(C)C(C=C)=C3\C=C4/N=C(/C=C5\N([Mg]N13)/C(=C\2)C(CCC(O)=O)=C5C)C(C=C)=C4C
InChI returned by jchem:
InChI=1S/C34H34N4O4.Mg/c1-7-21-17(3)25-13-26-19(5)23(9-11-33(39)40)31(37-26)16-32-24(10-12-34(41)42)20(6)28(38-32)15-30-22(8-2)18(4)27(36-30)14-29(21)35-25;/h7-8,13-16H,1-2,9-12H2,3-6H3,(H4,35,36,37,38,39,40,41,42);/q;+2/p-2/b25-13?,26-13-,27-14-,28-15-,29-14?,30-15?,31-16?,32-16-;
SMILES returned by jchem:
CC1=C(CCC(O)=O)C2=N\C\1=C/C1=C(C)C(C=C)=C3\C=C4/N=C(/C=C5\N([Mg]N13)/C(=C\2)C(CCC(O)=O)=C5C)C(C=C)=C4C
This time the SMILES used for the import and the one returned are consistent.
InChI returned by jchem for the major microspecies at pH 7.3:
InChI=1S/C34H34N4O4.Mg/c1-7-21-17(3)25-13-26-19(5)23(9-11-33(39)40)31(37-26)16-32-24(10-12-34(41)42)20(6)28(38-32)15-30-22(8-2)18(4)27(36-30)14-29(21)35-25;/h7-8,13-16H,1-2,9-12H2,3-6H3,(H4,35,36,37,38,39,40,41,42);/q;+2/p-4
SMILES returned by jchem for the major microspecies at pH 7.3:
CC1=C(CCC([O-])=O)C2=CC3=C(CCC([O-])=O)C(C)=C4C=C5N=C(C=C6N([Mg]N34)C(=CC1=N2)C(C)=C6C=C)C(C)=C5C=C
This SMILES differs from the one obtained when importing from the molfile and asking for the major microspecies at pH 7.3 (notably, it no longer carries any "/" or "\" double-bond stereo markers):
CC1=C(CCC([O-])=O)C2=N\C\1=C/C1=C(C)C(C=C)=C3\C=C4/N=C(/C=C5\N([Mg]N13)/C(=C\2)C(CCC([O-])=O)=C5C)C(C=C)=C4C
So my question is: why do the results differ depending on the format used for the import?
PS: I use jchem release 5.12