User 93ffa33d02
20-02-2015 15:20:52
We use similarity search when other, “more exact” search types
have been unsuccessful. We run the searches with JChemBase for Java.
Which options can we use to speed up similarity
search? We only need a small set of the most similar compounds (from 10 to
1000 compounds), but the search will run against a really large database (more than
20 million compounds). According to the documentation, we cannot use the
“setMaxResultCount” option for this because “In case of similarity searches, the full search is
performed and the maxResultCount most similar results will be given back. In
this case this option does not mean speedup.” Is this still true?
Can the “setFilterQuery” option increase speed? Is the filtering
applied before the search or after it? For example, if we have 40 million compounds and use
an SQL query to filter on cd_id so that only 20 million remain, will the similarity search be faster
than on the full 40 million?
We currently use JChem 6.2.1 but are going to update to the latest version in the near future.