Need instructions for searching Tautomers.

User 7f33ec9a5c

03-11-2012 23:14:36

Hi, I need help finding a concrete example of how to do an indexed tautomer search using JCart. All I can find is scattered bits and pieces. If you could read through this post, it may help you assess how difficult the documentation is to follow. As you read what I did, I hope you will see how the lack of clear examples for specific cases makes it hard to get anything done without ChemAxon support.


My understanding is that the most useful function for searching is jc_compare, so I tried to learn how to do a tautomer search as follows:



Reading from the manual on JC_Compare:

https://www.chemaxon.com/jchem/doc/dev/cartridge/cartapi.html#jc_compare

"The third argument is an (option list). The following options can be specified. (Detailed information about the semantics of these options can be found in the concepts section and the Query Guide.)"



For the tautomer search option:

https://www.chemaxon.com/jchem/doc/dev/cartridge/cartapi.html#jc_compare_tautomerSearch

"tautomerSearch:<tautomer-search-mode> 
Specifies tautomer search mode. See the JChem Query Guide.
<tautomer-search-mode> may be one of the following:

• d (the default) behaviour is defined by the search context: tautomer search is turned on for duplicate searches in a tautomer duplicate database table; it is turned off in all other contexts.

• y tautomer search is turned on.

• n tautomer search is turned off."






So I took this to mean that "tautomerSearch:d" was how one would search for tautomers. 


SELECT * FROM STRUCTURE WHERE JC_COMPARE(<JC_IDX COLUMN>, <QUERY SMILES>,'tautomerSearch:d')=1;


... which just returned garbage.

So looking into this further, I found a page, https://www.chemaxon.com/jchem/doc/dev/dbconcepts/index.html#searchtypesandoptions, which made it sound like tautomer search is a search option that should be placed after an exact match, so I tried:

SELECT * FROM STRUCTURE WHERE JC_COMPARE(<JC_IDX COLUMN>, <QUERY SMILES>,'t:d tautomerSearch:d')=1;

and that just returned an exact match.


That did not work, so next I read:

In addition to the above search types, there are many search options that modify structure search behavior. The most important options are listed below, the full and detailed list can be found in the JChem Search Option Guide. Please click the links in the titles for more information.  

so I clicked on https://www.chemaxon.com/jchem/doc/user/query_tautomer_searchoptions.html, and found the clearest page yet which just gave my initial search suggestion at the top 'tautomerSearch:d'


...so I ended up right back where I started.


Then a friend suggested that I read through the following three links, and that perhaps I need more indexes to do structure searching. At this point I'm totally lost and just sitting on my hands.



http://www.chemaxon.com/jchem/doc/dev/dbconcepts/index.html#tautomer_duplicate_table
http://www.chemaxon.com/jchem/doc/dev/cartridge/index.html#tautomer_duplicate_filtering
http://www.chemaxon.com/jchem/doc/user/query_searchoptions.html#tautomerSearch



Any suggestions would be appreciated.  Especially a link to EXAMPLE CODE that works.



ChemAxon 9c0afc9aaf

03-11-2012 23:49:20

 


So I took this to mean that "tautomerSearch:d" was how one would search for tautomers. 

Not really.


First of all, specifying the default value of a parameter will not change anything. That is a general rule in any API.


We perform a tautomer search by default (that is, automatically, with no parameters set) only if the Tautomer Duplicate Filtering index option (or table option for JCB tables) was specified during table/index creation. This method uses generic tautomers and is only applicable to duplicate filtering.


However, in other cases it is also possible to perform a tautomer search, simply by specifying "tautomerSearch:y".


The relevant section of the JChem Database Concepts (a very useful document) details the various methods:


http://www.chemaxon.com/jchem/doc/dev/dbconcepts/index.html#tautomers


Above I was referring to methods #1 and #2. Method #1 is the most efficient for duplicate filtering (t:d); method #2 can be used for other searches (substructure, etc.).


Best regards,


Szilard

User 7f33ec9a5c

04-11-2012 00:37:01

Your suggestion:


However in other cases it is also possible to perform a tautomer search, simply by specifying "tautomerSearch:y".


does not work. It returns a whole bunch of stuff that is not a tautomer of the query structure.


I don't think you realize how unclear your documentation is.


I just need a concrete example of how to build an index to do a tautomer search, then how to search the index.


Below are some failure modes:


-------------------------------------------------------------------------------------


USING:


CREATE INDEX jc_idx ON structure(s_smiles) INDEXTYPE IS jchem.JC_IDXTYPE PARAMETERS('TABLESPACE=structure_indexes')



select s_smiles
  from structure
 where jc_compare(s_smiles,'C(Nc1ccnN1)c2cccs2','tautomerSearch:y') = 1;



returns



Brc1cc(sc1Br)C(=O)Nc2cc(n[nH]2)C3CC3


Brc1cc(sc1Br)C(=O)Nc2cc(n[nH]2)c3ccncc3


etc...


 


Which is obviously not right.


-----------------------------------------------


 


So then I tried the following:


 


CREATE INDEX jc_idx ON structure(s_smiles) INDEXTYPE IS jchem.JC_IDXTYPE PARAMETERS('TABLESPACE=structure_indexes tdf=y')

ORA-29855: error occurred in the execution of ODCIINDEXCREATE routine ORA-29532: Java call terminated by uncaught Java exception: java.rmi.ServerException: RemoteException occurred in server thread; nested exception is: java.rmi.RemoteException: java.sql.SQLException: ORA-00922: missing or invalid option

CREATE INDEX jc_idx ON structure(s_smiles) INDEXTYPE IS jchem.JC_IDXTYPE PARAMETERS('TABLESPACE=structure_indexes tdf:y')

ORA-29855: error occurred in the execution of ODCIINDEXCREATE routine ORA-29532: Java call terminated by uncaught Java exception: java.rmi.ServerException: RemoteException occurred in server thread; nested exception is: java.rmi.RemoteException: java.sql.SQLException: Missing IN or OUT parameter at index:: 1


 



ChemAxon 9c0afc9aaf

04-11-2012 01:00:11

 


does not work. It returns a whole bunch of stuff that is not a tautomer of the query structure.

In the example above you are performing a substructure search (the default search type) with the tautomer "y" option.


In this case tautomers of the query are enumerated, and substructure search is run with all tautomers (including the original).


Your query is a substructure of the targets, so the hits are correct.


 


CREATE INDEX jc_idx ON structure(s_smiles) INDEXTYPE IS jchem.JC_IDXTYPE PARAMETERS('TABLESPACE=structure_indexes tdf=y')

CREATE INDEX jc_idx ON structure(s_smiles) INDEXTYPE IS jchem.JC_IDXTYPE PARAMETERS('TABLESPACE=structure_indexes tdf:y')

Please see:


http://www.chemaxon.com/jchem/doc/dev/cartridge/index.html#create_index




PARAMETERS('param1=paramvalue1,param2=paramvalue2,...');



You need to use "," and "=" in the parameter list.
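Applied to the failing statements quoted above, that rule would give something like the following. This is only a sketch that changes the separators; it assumes tdf is accepted as the option name, as it is written elsewhere in this thread, and reuses the table, column, and tablespace names from the earlier post.

```sql
-- Options are separated by "," and written as name=value,
-- following the PARAMETERS('param1=paramvalue1,param2=paramvalue2,...') pattern.
-- Assumption: tdf=y is the tautomer duplicate filtering option, as used in this thread.
CREATE INDEX jc_idx ON structure(s_smiles)
  INDEXTYPE IS jchem.JC_IDXTYPE
  PARAMETERS('TABLESPACE=structure_indexes,tdf=y');
```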


I hope this helps.


Best regards,


Szilard

User 7f33ec9a5c

07-11-2012 03:52:33

Hi Szilard,


After our discussion today, I created an index without tdf=y:


CREATE INDEX jc_idx ON structure(s_smiles) INDEXTYPE IS jchem.JC_IDXTYPE PARAMETERS('TABLESPACE=structure_indexes,haltOnError=nf');


Once that was done, I tried :


select s_smiles
from structure
where jc_compare(s_smiles,'C(Nc1ccnN1)c2cccs2','t:d tautomerSearch:y') = 1;


and got the error:


ORA-29904: error in executing ODCIIndexClose() routine
ORA-20105: Tables created without tautomer duplicate filtering ("SENOBASE.JC_IDX_JCX") do not support duplicate search with --tautomer or --tdf option. Please change your table settings to using tautomer filtering(eg. using jcman GUI: File -> Table options -> 'Duplicate search uses tautomers') and recalculate your table for the most accurate tautomer duplication control.
ORA-06512: at "JCHEM.JCHEM_CORE_PKG", line 36
ORA-06512: at "JCHEM.JC_IDXTYPE_IM", line 668
ORA-29903: error in executing ODCIIndexFetch() routine
ORA-20105: Tables created without tautomer duplicate filtering ("SENOBASE.JC_IDX_JCX") do not support duplicate search with --tautomer or --tdf option. Please change your table settings to using tautomer filtering(eg. using jcman GUI: File -> Table options -> 'Duplicate search uses tautomers') and recalculate your table for the most accurate tautomer duplication control.
ORA-06512: at "JCHEM.JCHEM_CORE_PKG", line 36
ORA-06512: at "JCHEM.JC_I


 


But as we discussed before, if I create the index with tdf=y, it affects the behavior of all other searches.



How do I enable tautomer duplicate searches and regular searches on the same table, in a way that does not require me to specify tautomerSearch:n with every query that should not return tautomers?



Thank you,
~mike

ChemAxon 9c0afc9aaf

07-11-2012 04:50:11

Hi Mike,


I'm sorry, I totally forgot about this limitation!


1. You can go back to the tdf=y index option. This also gives the best search performance, as I have explained.


I would recommend this if you can put up with explicitly disabling tautomer search whenever it is not needed for a duplicate search. Other searches are not affected.


2. Alternatively, you may use FULL search (t:f) instead of duplicate search (t:d), without the tdf index option.


Since you only have SMILES, which cannot contain query features, in your case these modes are practically the same.


But as we discussed before, if I create the index with tdf=y, it affects the behavior of all other searches.

Only the duplicate (t:d) searches. 
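As a rough illustration of option 1, the two duplicate-search variants might look like this. The sketch assumes that jc_idx was created on structure(s_smiles) with the tdf option enabled, and it reuses the query SMILES from earlier in the thread; it has not been run against a real table.

```sql
-- Assumes the index on structure(s_smiles) was built with tautomer duplicate filtering (tdf).

-- Duplicate search: tautomer matching is on by default for such an index.
SELECT s_smiles
  FROM structure
 WHERE jc_compare(s_smiles, 'C(Nc1ccnN1)c2cccs2', 't:d') = 1;

-- Duplicate search with tautomer matching explicitly switched off.
SELECT s_smiles
  FROM structure
 WHERE jc_compare(s_smiles, 'C(Nc1ccnN1)c2cccs2', 't:d tautomerSearch:n') = 1;
```

Searches of other types (for example the default substructure search) keep tautomer matching off unless tautomerSearch:y is passed.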


Best regards,


Szilard



ChemAxon a3d59b832c

07-11-2012 08:34:32



2. Alternatively, you may use FULL search (t:f) instead of duplicate search (t:d), without the tdf index option.

Since you only have SMILES, which cannot contain query features, in your case these modes are practically the same.

 


One extra warning here: if you would like to simulate a duplicate match with a full structure search, you will also need to set the other search options to "exact" mode.


For example: "t:f charge:e radical:e isotope:e stereoSearchType:e vagueBond:n"


See: http://www.chemaxon.com/jchem/doc/dev/cartridge/cartapi.html#jc_compare
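Put together with the tautomer option, a full-structure query along these lines might be written as follows. This is a sketch only: it combines the option string above with tautomerSearch:y and reuses the table, column, and query SMILES from earlier in the thread.

```sql
-- Full-structure search (t:f) with tautomer matching enabled and the remaining
-- options in exact mode, to approximate a tautomer-aware duplicate search
-- on an index created without the tdf option.
SELECT s_smiles
  FROM structure
 WHERE jc_compare(
         s_smiles,
         'C(Nc1ccnN1)c2cccs2',
         't:f tautomerSearch:y charge:e radical:e isotope:e stereoSearchType:e vagueBond:n'
       ) = 1;
```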


 


Best regards,


Szabolcs


 

User 7f33ec9a5c

08-11-2012 06:22:39

Thank you both. Those explanations helped a lot. Everything is working as expected now.


Maya pointed out that t:ff also returns tautomers, but we'd sort of expect this behavior, because it's a duplicate match on a fragment.


~mike