Comparing results from jc_compare and jcf_tanimoto

User 952e1d9361

18-12-2009 23:21:46

Hello,


I wonder if someone could help explain something to me. We are using jc_compare like so:


create table t as
select structure
from table
where jc_compare(structure, 'CC1=CC=CC=C1', 't:t simThreshold:0.75') = 1;


This returns 10 results in our case.


We then run:


select structure, jcf_tanimoto(structure, 'CC1=CC=CC=C1') as sim
from t;


This returns similarity values below 0.75, even though 0.75 is the threshold we used to generate the data.


How can this be? Am I missing something obvious?


Thanks,


Steve


PS - these are our JChem details:


 


Oracle environment:
Oracle Database 10g Release 10.2.0.4.0 - 64bit Production
PL/SQL Release 10.2.0.4.0 - Production
CORE 10.2.0.4.0 Production
TNS for Linux: Version 10.2.0.4.0 - Production
NLSRTL Version 10.2.0.4.0 - Production

JChem Server environment:
Java VM vendor: Sun Microsystems Inc.
Java version: 1.6.0_13
Java VM version: 11.3-b02
JChem version: 5.2.0
JChem Index version: 5020005
JDBC driver version: 11.1.0.7.0-Production




ChemAxon aa7c50abf8

19-12-2009 14:03:19

Hello,


I am attaching a very simple script that I wrote in an attempt to reproduce the problem, together with its output. Could you please have a look and modify the script so that it reproduces the problem? In its current form, it produces a result that appears to be consistent with the documentation. If confidential data is involved, you can send it to me by e-mail at pkovacs at chemaxon dot com.


Thanks

User 952e1d9361

19-12-2009 18:00:04

Thanks Peter, I have emailed the files to you.


Steve

ChemAxon aa7c50abf8

20-12-2009 16:37:07

Steve,


The crux of the problem turned out to be that the fingerprint settings associated with the index on the first table differ from the default fingerprint settings. In this case, the following criteria must be met to get consistent similarity results when searching in another table:


1. The second table must also be indexed, and the fingerprint settings associated with both indices must be the same. (Different fingerprint settings result in different similarity values for the same target/query structure pair.)


2. Instead of the function jcf.tanimoto (or the deprecated jcf_tanimoto), the operator jc_tanimoto has to be used, so that the fingerprint settings associated with the second table's index are picked up. (Functions don't know about indices; operators do.)
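

As a rough sketch of the second point, the check query from the original post could be rewritten with the operator as shown below. This assumes that table t already has a JChem index with the same fingerprint settings as the index on the first table, and that jc_tanimoto takes the target column and the query structure in the same order as jcf_tanimoto does in the original query. Please check both assumptions against the cartridge documentation for your JChem version.

```sql
-- Prerequisite (not shown): table t has a JChem index whose fingerprint
-- settings match those of the index on the first table.

-- The operator jc_tanimoto replaces the function jcf_tanimoto, so the
-- similarity should be computed with the fingerprint settings of t's index.
-- Argument order is assumed to mirror the jcf_tanimoto call above.
select structure, jc_tanimoto(structure, 'CC1=CC=CC=C1') as sim
from t;
```

If both criteria are met, the values in the sim column should be consistent with the 0.75 threshold that was used in the jc_compare search.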


Peter