trouble with tautomerize in std_config in index

User f05f6b8c05

18-11-2012 06:08:22

Hi,


I'm trying to index a table with tautomerize included in the std_config parameters. Please see the attached file with commands to recreate my problem.


We are using JChem 5.11.2.


Thank you very much for any help.


Best regards,


Andrew

ChemAxon aa7c50abf8

19-11-2012 10:50:20

Hi Andrew,


Thank you for reporting this issue. We are looking into it.


Are you sure about the aromatization method 'd'? I can't find it in the documentation: http://www.chemaxon.com/jchem/doc/user/StandardizerConfiguration.html#actionstring


Thank you for your patience.


Peter

User f05f6b8c05

19-11-2012 13:26:50

Thanks, Peter. I am happy to use a different aromatization option if that would be better. "aromatize:d" does show up a few times on this page:


  https://www.chemaxon.com/jchem/doc/dev/cartridge/index.html


E.g.


"Create index with Daylight-style aromatization:

CREATE INDEX jc_idx_2 ON jchemtest2(structure_col) INDEXTYPE IS jc_idxtype PARAMETERS('std_config=aromatize:d');"

ChemAxon aa7c50abf8

19-11-2012 15:20:35

Sorry. We will fix the JCC documentation. Here are the official options for JChem 5.11: http://www.chemaxon.com/jchem/doc/user/StandardizerConfiguration.html#actionstring .


Peter

User f05f6b8c05

20-11-2012 01:27:57

Thanks for the correction, but even if I just use aromatize (and not aromatize:d) I still get the same behavior that I originally posted about.

ChemAxon 4a2fc68cd1

20-11-2012 14:48:47

Hi,


We found that this issue is related to the molecule format conversion. When the smiles format is converted to mol format, the stereo information gets corrupted somehow (cis/trans is flipped). We will fix this issue and let you know when this fix will be available.


In fact, you can skip the format conversion, insert the smiles representation directly into the database, and also use the smiles string as the query. This would avoid the problem. (Unfortunately, format conversion may cause information loss or produce misleading information.)
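
As a rough sketch of that workaround, using the ttest table and mol_structure column from the queries in this thread: the id value and the SMILES string below are made-up placeholders, and the example assumes mol_structure is a text column that accepts SMILES.

```sql
-- Hypothetical row: store the SMILES text as-is, with no conversion to mol format.
INSERT INTO ttest (id, mol_structure) VALUES (5, 'C/C=C/C');

-- Use the SMILES string directly as the query structure as well.
SELECT id FROM ttest
 WHERE jchem.jc_compare(mol_structure, 'C/C=C/C', 't:f stereoSearchType:e') = 1;
```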


Anyway, upcoming JChem releases will significantly improve tautomer searches, both in terms of efficiency and in terms of stereo handling. Once these improvements are available, it will be beneficial to use the tautomer search option instead of creating a table with "tautomerize" standardization. The next release, 5.12, will provide the improvements (at least) for duplicate and full structure search types.
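
For illustration, a tautomer-aware search might look like the query below. The option string tautomer:y is an assumption based on the search options page, so please check it against the documentation for your JChem version. The table and column names are the ones used elsewhere in this thread.

```sql
-- Full structure search with tautomer handling requested at query time
-- (assumed option name: tautomer:y), instead of 'tautomerize' in std_config.
SELECT id FROM ttest
 WHERE jchem.jc_compare(mol_structure,
                        (SELECT mol_structure FROM ttest WHERE id = '2'),
                        't:f tautomer:y') = 1;
```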


Best regards,
Peter

User f05f6b8c05

20-11-2012 18:43:21

Thank you for the information.


I only used SMILES in the example to keep the script for reproducing the problem simple. However, if I use the 2 attached mol files (and load them into the mol_structure column as text without going through SMILES), I see the same problem. Is this a different problem? Or am I doing something incorrectly?


Thanks,


Andrew

ChemAxon 61b4fee994

21-11-2012 12:43:43

Hi,


We tried your example with the attached files and we received both hits for the query you sent. Could you recreate the scenario and, after running the query that returns no hits, run the following query and send us the returned rows?


Our query:


select * from akidx_jcx,ttest where ttest.rowid=akidx_jcx.rid and ttest.id in (2,4) order by ttest.id;


 


Thank you,


Tamas

User f05f6b8c05

21-11-2012 13:00:50

Hi,


This is very strange .. today ..


select id from ttest where jchem.jc_compare(mol_structure, (select mol_structure from ttest where id='2'), 't:f stereoSearchType:e')=1;


.. returns (only) id 2 (yesterday, this query returned no hits, I promise)


[see attached 2.txt that contains your query results]


 


However, this query ..


select id from ttest where jchem.jc_compare(mol_structure, (select mol_structure from ttest where id='4'), 't:f stereoSearchType:e')=1;


returns no hits


[see attached 4.txt that contains your query results]


 


Thanks for the help.  Please let me know if any other info would be useful.


 


Andrew

ChemAxon 61b4fee994

21-11-2012 13:52:34

Thank you, it shows the same for us now. We will try to fix it soon.


 


Best regards,


Tamas

User f05f6b8c05

21-11-2012 14:15:19

Excellent! I am glad we see the same thing .. smile.


Thank you

ChemAxon 4a2fc68cd1

21-11-2012 14:27:19

Hi,


We found the reasons for these problems:


1. The difference between the results for the 2 and 4 queries (1 vs 0 hits) is due to a bug in tautomerization that we have already fixed. The next major version, 5.12, will contain this fix.


2. The difference between 'aromatize' and 'addExplicitH..tautomerize..aromatize' standardization (2 vs 1 hit) is due to improper placement of explicit hydrogens in the latter case. You can check this as follows:



In fact, placing explicit hydrogens correctly is a complex task and in particular cases, such issues may occur. Do you really need this when you insert your structures into the DB? Would it be possible to use explicit hydrogen addition only when the structures are displayed and not in DB?


Best regards,
Peter

User f05f6b8c05

21-11-2012 14:35:47

Hi,


Thank you for the information.


I only use the addExplicitH option because of


http://www.chemaxon.com/jchem/doc/user/query_searchoptions.html which says:


"The query must have all implicit hydrogens set for proper perception of the possible tautomers."


I know that statement applies to the query, but I thought it would be better to include it in the index as well. Is addExplicitH in the index not required?


What is the best way to ensure that the query has "all implicit hydrogens set"?


Thank you.

ChemAxon a3d59b832c

21-11-2012 15:03:52

Hi Andrew,


 


http://www.chemaxon.com/jchem/doc/user/query_searchoptions.html which says:


"The query must have all implicit hydrogens set for proper perception of the possible tautomers."


I know that statement applies to the query, but I thought it would be better to include it in the index as well. Is addExplicitH in the index not required?


What is the best way to ensure that the query has "all implicit hydrogens set"?


 


No, you are not required to use the addExplicitH action.


What we meant on that help page was that our tautomer module needs the hydrogen information on the molecule. Either explicit or implicit hydrogens would be suitable. In fact, all of our input modules now add missing implicit hydrogens to the molecule automatically. (This was not the case when that page was written.) MarvinSketch also updates the implicit hydrogen information automatically during sketching.


So you do not need to do anything to ensure this.
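
In other words, an index definition along these lines should be enough. This is only a sketch: the index name akidx is a guess based on the akidx_jcx table mentioned earlier in the thread, and the '..' separator between actions is copied from the 'addExplicitH..tautomerize..aromatize' string quoted above.

```sql
-- Same standardization as before, minus the addExplicitH action.
CREATE INDEX akidx ON ttest(mol_structure)
  INDEXTYPE IS jc_idxtype
  PARAMETERS('std_config=tautomerize..aromatize');
```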


We will revise that documentation page to avoid further misunderstanding. Thank you for pointing this out to us!


 


Best regards,


Szabolcs


 

User f05f6b8c05

21-11-2012 15:13:58

Thank you for explanation .. having to do nothing is always a good solution .. smile.


Best,


Andrew