Two structure search issues

User 10a23c54c1

30-11-2005 19:53:19

I want to report two problems with structure searches using the JChem Cartridge.





1) The exact search is very slow for small structures. For instance, if I run the query





select * from structure where jc_compare(jc_smiles,'c1ccccc1','t:e')=1





it takes about 2 minutes to return. Here is the output from the JChemstreams servlet:





Wed Nov 30 11:43:58 PST 2005
Search mode: EXACT
Structure table: LDDB_WAREHOUSE.STRUCTURE$JC_IDX_JCX
Query: c1ccccc1
Screened: 1799454
Hits: 0
Total time: 103320 ms Screening: 240 ms
Processing threads: 2
Current / peak / maximum searches per minute: 2 / 3 / Unlimited





The perfect and similarity searches, however, are very fast.








2) When I search using the csmol format, the stereochemistry is ignored for some reason.

ChemAxon 9c0afc9aaf

01-12-2005 09:41:37

Dear Hayk,





1.


- Similarity search is very fast because it does not require a graph search; it only calculates the Tanimoto distance between binary fingerprints.





- Perfect search is also very fast because it uses a hash code (the cd_hash column) to determine whether two molecules are equivalent. This produces very few false screened hits, so the number of graph search calls is minimal.





- Exact search uses the same screening algorithm as Substructure search: the target fingerprint must "contain" the query fingerprint.

In your case this results in a large number of screened structures (benzene, for example, is a common substructure), and the CPU-intensive graph search is run for every one of them.
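
To illustrate the difference between the two fingerprint operations described above, here is a minimal Python sketch. It is not JChem code: the fingerprints are made-up 4-bit values stored as Python integers, and the function names are hypothetical. It only shows why a containment screen lets many structures through for a small query such as benzene, while a Tanimoto calculation is a cheap bit count.

```python
def popcount(x: int) -> int:
    """Number of bits set in a non-negative integer."""
    return bin(x).count("1")


def tanimoto_coefficient(a: int, b: int) -> float:
    """Shared bits divided by bits set in either fingerprint."""
    union = popcount(a | b)
    return popcount(a & b) / union if union else 1.0


def tanimoto_distance(a: int, b: int) -> float:
    """Dissimilarity: 1 minus the Tanimoto coefficient."""
    return 1.0 - tanimoto_coefficient(a, b)


def passes_containment_screen(query_fp: int, target_fp: int) -> bool:
    """True if every bit set in the query is also set in the target."""
    return (query_fp & target_fp) == query_fp


# Made-up toy fingerprints (assumption: real fingerprints are far longer).
toy_fps = {
    "benzene": 0b0011,
    "toluene": 0b0111,
    "phenol": 0b1011,
    "cyclohexane": 0b0100,
}

query_fp = toy_fps["benzene"]
screened = [name for name, fp in toy_fps.items()
            if passes_containment_screen(query_fp, fp)]
print(screened)
```

In this toy set every structure that contains the benzene bits passes the screen, so each of them would go on to the graph search even though only one can be an exact match.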





We have discussed this, and concluded that if there are no query features (query atom or bond types) in the query, we can also use hash codes in the screening phase for Exact search. This should greatly reduce the number of screened structures, and improve the performance significantly.


This improvement will be available in the near future, probably in the next JChem release.
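
The idea can be sketched roughly as follows. This is only an illustration under stated assumptions, not the JChem implementation: `toy_hash` is a stand-in for whatever fills the cd_hash column, the `Row` fields other than cd_hash are hypothetical, and a plain string comparison stands in for the graph search.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Row:
    cd_id: int       # hypothetical row identifier
    structure: str   # stand-in for the stored structure
    fp: int          # made-up toy fingerprint
    cd_hash: int     # hash code used for screening


def toy_hash(structure: str) -> int:
    """Stand-in hash (assumption); JChem's real hash code is different."""
    h = 0
    for ch in structure:
        h = (h * 31 + ord(ch)) & 0xFFFFFFFF
    return h


def exact_search(query, query_fp, rows, has_query_features):
    """Return (hit ids, number of screened rows)."""
    if has_query_features:
        # Fallback: fingerprint containment screen, as in substructure search.
        candidates = [r for r in rows if (r.fp & query_fp) == query_fp]
    else:
        # No query atom or bond types: screen on the hash code instead.
        query_hash = toy_hash(query)
        candidates = [r for r in rows if r.cd_hash == query_hash]
    # String equality stands in for the CPU-intensive graph search.
    hits = [r.cd_id for r in candidates if r.structure == query]
    return hits, len(candidates)


toy_data = [("c1ccccc1", 0b0011), ("Cc1ccccc1", 0b0111),
            ("Oc1ccccc1", 0b1011), ("C1CCCCC1", 0b0100)]
rows = [Row(i, s, fp, toy_hash(s)) for i, (s, fp) in enumerate(toy_data, 1)]

fp_hits, fp_screened = exact_search("c1ccccc1", 0b0011, rows, True)
hash_hits, hash_screened = exact_search("c1ccccc1", 0b0011, rows, False)
print(fp_hits, fp_screened, hash_hits, hash_screened)
```

Both paths return the same hit in this toy example, but the hash screen passes a single candidate to the expensive comparison where the fingerprint screen passes three.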





2. We could not reproduce the problem yet.


Could you provide us with some examples?

Please also let us know which JChem version you are using.





Kind regards,





Szilard

ChemAxon 9c0afc9aaf

14-12-2005 17:33:58

Dear Hayk,





JChem 3.1.4 has been released and is available for download.


In this version the EXACT search uses the hash code when possible (that is, when there are no query atom or bond types in the query).





Regards,





Szilard