IUPAC naming issue

User cf4264f752

22-11-2011 12:29:10

Hi,


 


thanks for your detailed answer.


But unfortunately it seems that our chemists are having other problems with Marvin that keep them from using it. 


For structure Corticosterone.mol Marvin Sketch generates the following IUPAC name


(1S,2R,10S,11S,14S,15S,17S)-17-hydroxy-14-(2-hydroxyacetyl)-2,15-dimethyltetracyclo[8.7.0.0^{2,7}.0^{11,15}]heptadec-6-en-5-one


and nothing can be found in the PubChem database using this name. We can find this compound in PubChem by structure (or just by its common name) and see that it has a different IUPAC name there:


(8S,9S,10R,11S,13S,14S,17S)-11-hydroxy-17-(2-hydroxyacetyl)-10,13-dimethyl-1,2,6,7,8,9,11,12,14,15,16,17-dodecahydrocyclopenta[a]phenanthren-3-one


Also, Marvin Sketch isn't able to generate the structure from the latter IUPAC name.

ChemAxon e7b9408ca1

24-11-2011 10:23:31

Hi Alena,


Thank you for your report. In general, several different correct names can be generated for the same structure, and I believe this is the case here. We currently generate a bridged name for this structure, while PubChem has the fused name. Could you confirm what the problem is? Would you prefer the fused names?


We do support generating some fused names, however in the most complex cases we resort to generating the bridged names. We are planning to extend our support for fused names in all possible cases. This will allow the import of those same names as well.


Best regards,


Daniel Bonniot

User cf4264f752

21-12-2011 16:17:34

Thanks, Daniel!


I'm not sure if we prefer fused names, but for me it seems that they are more common than the bridged ones. At least it's true for cheminformatics tools that we use.


Is the following problem I have related to this issue?


Nothing is found by JChem using the following InChI identifier generated by Symyx Draw:


InChI=1S/C17H11BrFN3O4/c18-10-4-3-9(11(19)6-10)8-21-14(24)12-2-1-5-22(12)17(16(21)26)7-13(23)20-15(17)25/h1-6H,7-8H2,(H,20,23,25)/t17-/m1/s1


but the corresponding structure surely exists in our database (see file Ranirestat.mol).


The second InChI generated by Symyx Draw cannot be processed at all


select jc_evaluate('InChI=1S/C17H11BrFN3O4/c18-10-4-3-9(11(19)6-10)8-21-14(24)12-2-1-5-22(12)17(16(21)26)7-13(23)20-15(17)25/h1-6H,7-8H2,(H,20,23,25)/t17-/m1/s1', 'isValid("aromaticity..valence..queryAtom..queryBond")') from dual;


This query produces the following error:


 


ORA-20105: Option without value: >isValid("aromaticity..valence..queryAtom..queryBond")<


ORA-06512: at "JCHEM_GG_EDITORIAL.JCHEM_CORE_PKG", line 34


ORA-06512: at "JCHEM_GG_EDITORIAL.EXEC_FUNC", line 69


ORA-06512: at "JCHEM_GG_EDITORIAL.EVALUATE_FUNC", line 7


 


InChI generated from file Timcodar.mol


 


Search params are 't:d charge:i'


 


Oracle environment: 
Oracle Database 10g Enterprise Edition Release 10.2.0.5.0 - 64bi 
PL/SQL Release 10.2.0.5.0 - Production 
CORE 10.2.0.5.0 Production 
TNS for Linux: Version 10.2.0.5.0 - Production 
NLSRTL Version 10.2.0.5.0 - Production 

JChem owner: JCHEM 
JChem Server environment: 
Java VM vendor: Sun Microsystems Inc. 
Java version: 1.6.0_23 
Java VM version: 19.0-b09 
JChem version: 5.6.0.2 
JChem Index version: 5060000 
JDBC driver version: 11.1.0.7.0-Production

ChemAxon e7b9408ca1

21-12-2011 18:27:52

Yes, bridged names should be generated when possible. We are working on it. It should be a "cosmetic" issue only though (how generated names look) and should not affect searches: if you have structures, you will always be better off doing a structure search, rather than a text search using names.


This is why I think your latest issue is unrelated. I'm asking a colleague to answer it.

User cf4264f752

22-12-2011 08:52:00

I just thought that the conversion is sort of bidirectional. I mean, if you cannot generate a bridged name, you shouldn't be able to use it as a query either, because prior to the search you have to convert it into a structure representation. Am I wrong in this assumption?

ChemAxon e7b9408ca1

22-12-2011 09:12:46

Hi,


If I understand correctly, you are doing a search, where the input is "InChI=1S/C17H11BrFN3O4/c18-10-4-3-9(11(19)6-10)8-21-14(24)12-2-1-5-22(12)17(16(21)26)7-13(23)20-15(17)25/h1-6H,7-8H2,(H,20,23,25)/t17-/m1/s1".


This string can be converted into an internal structure representation, and used for a search. No naming is involved.


From the error message:


Option without value: >isValid("aromaticity..valence..queryAtom..queryBond")<

It looks to me like the query syntax is the problem, not the structure. A colleague will help with that.


Best regards,


Daniel


PS: although it is beside the point, it happens that structure to name CAN generate a bridged (and spiro) name for that structure:


(3R)-3'-[(4-bromo-2-fluorophenyl)methyl]-5-hydroxy-2,3',4,4'-tetrahydro-2'H-spiro[pyrrole-3,1'-pyrrolo[1,2-a]pyrazine]-2,2',4'-trione

User cf4264f752

22-12-2011 10:08:34

It's unlikely that the query syntax is the problem, because a query for benzene with the same params works just fine:




select jc_evaluate('InChI=1S/C6H6/c1-2-4-6-5-3-1/h1-6H', 'isValid("aromaticity..valence..queryAtom..queryBond")') from dual


and returns 1

ChemAxon a3d59b832c

22-12-2011 10:18:45

Hi Alena,


 


I have checked Ranirestat.mol above. Please note that the InChI string you quote is a different structure. It may be due to InChI's canonicalization. See below. (I have highlighted the different parts.) This is why JChem duplicate search did not find it.


 



 


However, if you keep the second line of the InChI (AuxInfo), it seems to me that the original structure is read back.


InChI=1S/C17H11BrFN3O4/c18-10-4-3-9(11(19)6-10)8-21-14(24)12-2-1-5-22(12)17(16(21)26)7-13(23)20-15(17)25/h1-6H,7-8H2,(H,20,23,25)/t17-/m1/s1
AuxInfo=1/1/N:16,15,21,24,11,22,6,17,19,25,20,10,3,14,2,9,5,26,23,1,12,8,7,18,4,13/it:im/rA:26NCCOC.eCONCCCNOCCCCOCCCCFCCBr/rB:s1;s1;d2;s2;s3s5;d3;s5;s5;s8;s8;s9;d9;s10s12;d-10;d-11s15;s12;d14;s17;s19;d+19;d-20;s20;s21;s22d-24;s25;/rC:20.7937,10.0436,0;19.2873,10.3637,0;21.5637,11.3773,0;18.1429,9.3333,0;19.1264,11.8953,0;20.5331,12.5218,0;23.0951,11.5382,0;19.1264,13.4353,0;17.7927,11.1253,0;17.7927,14.2053,0;20.2707,14.4657,0;16.459,11.8953,0;17.7927,9.5853,0;16.459,13.4353,0;18.1128,15.7117,0;19.6444,15.8726,0;15.1252,11.1253,0;15.1252,14.2053,0;13.7917,11.8953,0;12.4579,11.1253,0;13.7917,13.4353,0;11.1242,11.8953,0;12.4579,9.5853,0;12.4579,14.2053,0;11.1242,13.4353,0;9.7907,14.2053,0;


 


We will check what is wrong with the jc_evaluate line above.


 


Best regards,


Szabolcs


 

ChemAxon aa7c50abf8

22-12-2011 11:12:36

Hi Alena,


Due to a known bug, which will be fixed in JChem version 5.8, errors occurring in JC_EVALUATE are reported with the wrong error messages. Until 5.8 is released, the workaround is to use JC_EVALUATE_X. (Notice that the syntax is slightly different with JC_EVALUATE_X:


select jc_evaluate_x('InChI=1S/C17H11BrFN3O4/c18-10-4-3-9(11(19)6-10)8-21-14(24)12-2-1-5-22(12)17(16(21)26)7-13(23)20-15(17)25/h1-6H,7-8H2,(H,20,23,25)/t17-/m1/s1',
'chemTerms:isValid("aromaticity..valence..queryAtom..queryBond")') from dual;

)


Let us know what error message JC_EVALUATE_X gives.


Best Regards,


Peter

User cf4264f752

22-12-2011 12:29:28

Thanks, Peter 


JC_EVALUATE_X works without producing an error. But the following query still fails 


SELECT unit_id FROM chem_structs where jc_compare(structure, 'InChI=1R/C43H45ClN4O6/c1-47(43(51)41(50)34-26-38(52-2)42(54-4)39(27-34)53-3)37(33-12-14-35(44)15-13-33)28-40(49)48(29-32-8-6-5-7-9-32)36(16-10-30-18-22-45-23-19-30)17-11-31-20-24-46-25-21-31/h5-9,12-15,18-27,36-37H,10-11,16-17,28-29H2,1-4H3', 't:d charge:i') =1;



ORA-29902: error in executing ODCIIndexStart() routine


ORA-20101: No structure found.


User cf4264f752

22-12-2011 14:12:25










Szabolcs wrote:

I have highlighted the different parts.



 


Did you mean the additional AuxInfo part? Because otherwise they are the same.


Why then this query


 


SELECT unit_id FROM chem_structs where jc_compare(structure, 

substr((select jc_molconvert(structure, 'inchi') from chem_structs where unit_id = 721),

0,instr((select jc_molconvert(structure, 'inchi') from chem_structs where unit_id = 721), 'AuxInfo') - 2)

, 't:d charge:i') =1;

returns 721 (the id of the benzene structure, see benzene.mol)


and


SELECT unit_id FROM chem_structs where jc_compare(structure, 

substr((select jc_molconvert(structure, 'inchi') from chem_structs where unit_id = -1650449181),

0,instr((select jc_molconvert(structure, 'inchi') from chem_structs where unit_id = -1650449181), 'AuxInfo') - 2)

, 't:d charge:i') =1;

returns nothing 


-1650449181 is id for  Ranirestat.mol
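
For reference, the SUBSTR/INSTR expression in these two queries keeps only the text before 'AuxInfo', that is, the first line of the jc_molconvert output. The sketch below mimics that expression in Python. It assumes the output has the two-line layout shown earlier in this thread (an InChI= line, a single newline, then an AuxInfo= line). The helper names are hypothetical, and only the Oracle behaviors used here are modeled: INSTR is 1-based and returns 0 when nothing is found, while SUBSTR treats position 0 as 1 and returns NULL when the length is below 1.

```python
from typing import Optional


def oracle_instr(s: str, sub: str) -> int:
    """1-based position of sub in s, or 0 when absent (Oracle INSTR semantics)."""
    return s.find(sub) + 1


def oracle_substr(s: str, position: int, length: int) -> Optional[str]:
    """Small subset of Oracle SUBSTR; negative positions are not modeled."""
    if position < 0:
        raise ValueError("negative positions are not modeled in this sketch")
    if length < 1:
        return None          # Oracle returns NULL when the length is below 1
    if position == 0:
        position = 1         # Oracle treats position 0 as 1
    start = position - 1
    result = s[start:start + length]
    return result or None    # Oracle treats an empty string as NULL


def strip_auxinfo(converted: str) -> Optional[str]:
    """Mirror of substr(x, 0, instr(x, 'AuxInfo') - 2) from the queries above."""
    return oracle_substr(converted, 0, oracle_instr(converted, "AuxInfo") - 2)
```

Under these assumptions both queries search with the bare InChI line only, so the AuxInfo line that Szabolcs mentions above never reaches jc_compare.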

ChemAxon a3d59b832c

22-12-2011 14:29:23

Sorry, the image did not go through. Here it goes again.


 


The point is that the molecule changes if only the first line of the generated inchi string is used.

ChemAxon aa7c50abf8

22-12-2011 16:53:28

Alena,


alena wrote:

JC_EVALUATE_X works without producing an error. But the following query still fails 

SELECT unit_id FROM chem_structs where jc_compare(structure, 'InChI=1R/C43H45ClN4O6/c1-47(43(51)41(50)34-26-38(52-2)42(54-4)39(27-34)53-3)37(33-12-14-35(44)15-13-33)28-40(49)48(29-32-8-6-5-7-9-32)36(16-10-30-18-22-45-23-19-30)17-11-31-20-24-46-25-21-31/h5-9,12-15,18-27,36-37H,10-11,16-17,28-29H2,1-4H3', 't:d charge:i') =1;



ORA-29902: error in executing ODCIIndexStart() routine

ORA-20101: No structure found. 

With this different structure, JC_EVALUATE_X does produce an error (the same root error as with your SELECT statement):


SQL> select jc_evaluate_x('InChI=1R/C43H45ClN4O6/c1-47(43(51)41(50)34-26-38(52-2)42(54-4)39(27-34)53-3)37(33-12-14-35(44)15-13-33)28-40(49)48(29-32-8-6-5-7-9-32)36(16-10-30-18-22-45-23-19-30)17-11-31-20-24-46-25-21-31/h5-9,12-15,18-27,36-37H,10-11,16-17,28-29H2,1-4H3', 'chemTerms:isValid("aromaticity..valence..queryAtom..queryBond")') from dual;
select jc_evaluate_x('InChI=1R/C43H45ClN4O6/c1-47(43(51)41(50)34-26-38(52-2)42(54-4)39(27-34)53-3)37(33-12-14-35(44)15-13-33)28-40(49)48(29-32-8-6-5-7-9-32)36(16-10-30-18-22-45-23-19-30)17-11-31-20-24-46-25-21-31/h5-9,12-15,18-27,36-37H,10-11,16-17,28-29H2,1-4H3', 'chemTerms:isValid("aromaticity..valence..queryAtom..queryBond")') from dual
*
ERROR at line 1:
ORA-29532: Java call terminated by uncaught Java exception:
chemaxon.jchem.cartridge.oresident.nonidxscan.NonIdxScanException:
RemoteException occurred in server thread; nested exception is:
java.rmi.RemoteException: Problem importing query structure:
InChI=1R/C43H45ClN4O6/c1-47(43(51)41(50)34-26-38(52-2)42(54-4)39(27-34)53-3)37(3
3-12-14-35(44)15-13-33)28-40(49)48(29-32-8-6-5-7-9-32)36(16-10-30-18-22-45-23-19
-30)17-11-31-20-24-46-25-21-31/h5-9,12-15,18-27,36-37H,10-11,16-17,28-29H2,1-4H3
: chemaxon.formats.MolFormatException: No structure found.
ORA-06512: at "PK5602.JCHEM_CLOB_PKG", line 34
ORA-06512: at "PK5602.EXEC_FUNCV", line 42
ORA-06512: at "PK5602.EVALUATEX_FUNC", line 7 

This is a problem importing the InChI structure. I'll have my colleagues look at this structure and get back to you on this ASAP.


Best Regards,


Peter

ChemAxon d26931946c

23-12-2011 12:21:22

Hi Alena,


According to the current InChI definition, a standard InChI should start with "InChI=1S/" and a non-standard InChI should start with "InChI=1/".


If you replace the "R" with "S" in


 
InChI=1R/C43H45ClN4O6/c1-47(43(51)41(50)34-26-38(52-2)42(54-4)39(27-34)53-3)37(33-12-14-35(44)15-13-33)28-40(49)48(29-32-8-6-5-7-9-32)36(16-10-30-18-22-45-23-19-30)17-11-31-20-24-46-25-21-31/h5-9,12-15,18-27,36-37H,10-11,16-17,28-29H2,1-4H3


 to


 
InChI=1S/C43H45ClN4O6/c1-47(43(51)41(50)34-26-38(52-2)42(54-4)39(27-34)53-3)37(33-12-14-35(44)15-13-33)28-40(49)48(29-32-8-6-5-7-9-32)36(16-10-30-18-22-45-23-19-30)17-11-31-20-24-46-25-21-31/h5-9,12-15,18-27,36-37H,10-11,16-17,28-29H2,1-4H3


the import works fine.
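
A quick client-side check along these lines could catch such a prefix before the string reaches the cartridge. This is only a sketch: the function name classify_inchi_prefix is made up for illustration, it relies solely on the two prefixes quoted above ("InChI=1S/" and "InChI=1/"), and it does not validate the rest of the string.

```python
import re

# Prefixes as described in this thread: "InChI=1S/" for standard InChI,
# "InChI=1/" for non-standard InChI. Anything else (e.g. "InChI=1R/") is flagged.
_PREFIX = re.compile(r"^InChI=1(S?)/")


def classify_inchi_prefix(inchi: str) -> str:
    """Return 'standard', 'non-standard' or 'unrecognized', looking at the prefix only."""
    match = _PREFIX.match(inchi.strip())
    if match is None:
        return "unrecognized"
    return "standard" if match.group(1) == "S" else "non-standard"


if __name__ == "__main__":
    for candidate in (
        "InChI=1S/C6H6/c1-2-4-6-5-3-1/h1-6H",
        "InChI=1R/C43H45ClN4O6/c1-47",  # truncated, prefix is all that matters here
    ):
        print(candidate[:10], classify_inchi_prefix(candidate))
```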


 


Can you tell us where you got this InChI code?


 


Best regards,


Peter

User cf4264f752

24-12-2011 16:23:40










Szabolcs wrote:

The point is that the molecule changes if only the first line of the generated inchi string is used.



 


Oh, now I see. That is a rather complicated issue concerning InChI...


So I guess that to make such a search successful we should turn tautomerSearch on, and to do so we should set the 'tdf:y' option for the JChem index. Am I right?

User cf4264f752

24-12-2011 16:51:23










gezapeti wrote:

Can you tell us where you got this InChI code?



Hi, Peter


I was trying to investigate this question and it seems that it was just a mistake on our chemists' side. (I know it sounds awkward, but they just cannot remember where they got this code from.) However, they insist that this code is all right because Symyx Draw can generate a structure from it. So I think we should drop this issue from now on. Sorry for the confusion.

ChemAxon a3d59b832c

27-12-2011 16:57:12










alena wrote:

Oh, now I see. That is a rather complicated issue concerning InChI...


So I guess that to make such a search successful we should turn tautomerSearch on, and to do so we should set the 'tdf:y' option for the JChem index. Am I right?





Yes, that is correct. One additional warning: with the tdf:y index parameter, duplicate search defaults to tautomer search. You should explicitly switch it off if you want a non-tautomer search.


However, I think that InChI standardization is not just about tautomer transformations. The safest approach is probably to retain both lines of the generated InChIs and use the full information for molecule transfer. Even better may be to stick to good old molfiles, and use InChI just as an alternative field.
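
To illustrate keeping both lines together, here is a minimal sketch that splits a generated two-line output into its InChI and AuxInfo parts and joins them again unchanged. It assumes the layout shown earlier in this thread (a line starting with "InChI=" followed by a line starting with "AuxInfo="). The names InchiRecord, split_inchi_output and join_inchi_output are hypothetical and not part of any ChemAxon API.

```python
from typing import NamedTuple, Optional


class InchiRecord(NamedTuple):
    inchi: str               # the "InChI=..." line
    auxinfo: Optional[str]   # the "AuxInfo=..." line, or None when absent


def split_inchi_output(text: str) -> InchiRecord:
    """Pick the first InChI= line and the first AuxInfo= line out of the text."""
    inchi = None
    auxinfo = None
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if inchi is None and line.startswith("InChI="):
            inchi = line
        elif auxinfo is None and line.startswith("AuxInfo="):
            auxinfo = line
    if inchi is None:
        raise ValueError("no line starting with 'InChI=' found")
    return InchiRecord(inchi, auxinfo)


def join_inchi_output(record: InchiRecord) -> str:
    """Rebuild the two-line form so that the AuxInfo travels with the InChI."""
    if record.auxinfo is None:
        return record.inchi
    return record.inchi + "\n" + record.auxinfo
```

Storing the record as a whole, rather than only its first line, keeps the extra information available when the string is later used for import.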


 


Best regards,



Szabolcs