Chemical searching on non-Jchem tables

User 9ec4593bcc

06-06-2007 21:26:46

I am opening up a new post that is a continuation of:





http://www.chemaxon.com/forum/viewtopic.php?p=12043#12043





My general questions are regarding chemical structure searching in non-Jchem tables.





SQL*Plus: Release 10.2.0.1.0 - Production on Wed Jun 6 07:41:36 2007





Copyright (c) 1982, 2005, Oracle. All rights reserved.








Connected to:


Oracle Database 10g Enterprise Edition Release 10.2.0.1.0 - Production


With the Partitioning, OLAP and Data Mining options








JCHEM_CORE_PKG.GETENVIRONMENT()


--------------------------------------------------------------------------------





Oracle Database 10g Enterprise Edition Release 10.2.0.1.0 - Prod


PL/SQL Release 10.2.0.1.0 - Production


CORE 10.2.0.1.0 Production


TNS for 32-bit Windows: Version 10.2.0.1.0 - Production


NLSRTL Version 10.2.0.1.0 - Production


NLSRTL Version 10.2.0.1.0 - Production


JChem version in the database: 3.2.6


JChem version in the Tomcat server: 3.2.6


java.vm.version: 1.6.0_01-b06


java.vm.vendor: Sun Microsystems Inc.


Apache Tomcat/6.0.1





JCHEM_CORE_PKG.GETENVIRONMENT()


--------------------------------------------------------------------------------





Major JDBC version in Tomcat: 10


Minor JDBC version in Tomcat: 2








Disconnected from Oracle Database 10g Enterprise Edition Release 10.2.0.1.0 - Pr


oduction


With the Partitioning, OLAP and Data Mining options


E:\JChem\cartridge>








JChem version: 3.2.6


JChem Streams version: 3.2.6











SQL> desc tbl_chembridge;


Name Null? Type


----------------------------------------- -------- ----------------------------





MOLECULEID VARCHAR2(3072)


HACC NUMBER


IDNUMBER VARCHAR2(3072)


HDON NUMBER


TPSA NUMBER


CLOGP NUMBER


RB NUMBER


SMILES CLOB


SUPPLIER VARCHAR2(3072)


COLLECTION VARCHAR2(3072)


CATALOG_DATE VARCHAR2(3072)


ENTRY_DATE VARCHAR2(3072)











The database was indexed using the following command:





CREATE INDEX jc_idx_chemb ON tbl_chembridge(smiles) INDEXTYPE IS jc_idxtype


PARAMETERS('TABLESPACE=USERS,STORAGE=INITIAL 2M,std_config=aromatize:d');











The following SQL statements execute:





select count(*) from tbl_chembridge where jc_compare(SMILES, 'CN1C=NC2=C1C(=O)N(C)C(=O)N2C', 't:s') = 1;


select count(*) from tbl_chembridge where jc_compare(SMILES, 'CN1C=NC2=C1C(=O)N(C)C(=O)N2C', 't:s maxHitCount:50') = 1;








but when I use jc_tanimoto, jc_contains, jc_equals, jc_dissimilarity...errors occur








SQL> select count(*) from tbl_chembridge where jc_contains(SMILES, 'CN1C=NC2=C1C(=O)N(C)C(=O)N2C') = 1;


select count(*) from tbl_chembridge where jc_contains(SMILES, 'CN1C=NC2=C1C(=O)N(C)C(=O)N2C') = 1


*


ERROR at line 1:


ORA-29900: operator binding does not exist


ORA-06553: PLS-307: too many declarations of 'JC_CONTAINS' match this call





Or the error for jc_tanimoto:





ORA-06553: PLS-307: too many declarations of 'JC_TANIMOTO' match this call








What is the best searching function to use on CLOB fields containing molecular structures?


If jc_contains or jc_tanimoto works on Clob fields, could you please give an example?


Is performance faster searching BLOB fields on a table created by Jchem or a non-jchem cartridge seaching CLOB fields?





Thanks

ChemAxon aa7c50abf8

07-06-2007 08:44:12

JC_COMPARE(<target>, <query>, 't:s') is exactly the same as JC_CONTAINS(<target>, <query>).





JC_COMPARE(<target>, <query>, 't:t') is exactly the same as JC_TANIMOTO(<target>, <query>).





(You haven't asked about JC_EQUALS, but it is the same as JC_COMPARE with the 't:p' option.)





So I suggest to use JC_COMPARE with the appropriate search type option.





The general rule is that operators have the same type for the target structure parameter as for the query structure parameter in their bindings:


Code:
(VARCHAR2, VARCHAR2)


(CLOB, CLOB)


(BLOB, BLOB)



The underlying assumption being that


(a) the column type used depends primarily on the structure format (Molfile, SMILES,...) used;


(b) the same structure format is typically used for the target structure and the query.





JC_COMPARE is an exception: it also has an additional (CLOB, VARCHAR2, VARCHAR2) binding for the convenience of evaluators of JChem Cartridge using sqlplus.





Strictly speaking, this issue is not limited to non-JChem tables. Recent versions of JChem Base allow to use CLOB type for the cd_structure column of JChem tables as well (though this feature may not be currently supported specifically by IJC)





I hope this helps,


Peter

User 9ec4593bcc

07-06-2007 15:05:33

Thanks for the explaination and all the assistance.