Substructure search performance tuning

14-04-2006 11:06:33

If necessary, you can try different fingerprint settings and check the performance again.

Please see the documentation about the options:

http://www.chemaxon.com/jchem/doc/user/fingerprint.html

The easiest way to make a fingerprint less dark (that is, to lower the fraction of its bits that are set to 1) is to increase its length.

Please note that a longer fingerprint also increases the cache size.
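To make the trade-off concrete, here is a minimal Java sketch. It is not JChem's fingerprint generator: the class name, the made-up feature codes, and the one-bit-per-feature hashing are all assumptions chosen for illustration. It hashes the same features into bit sets of different lengths, reports the darkness, and computes the raw fingerprint storage for a table of a given size.

```java
import java.util.BitSet;

// Illustrative only: a toy hashed fingerprint, NOT the JChem implementation.
final class FingerprintDarknessSketch {

    // Hash each (hypothetical) feature code to a single bit position.
    static BitSet fingerprint(int[] featureCodes, int lengthInBits) {
        BitSet bits = new BitSet(lengthInBits);
        for (int code : featureCodes) {
            // Integer overflow is intentional; floorMod keeps the index non-negative.
            bits.set(Math.floorMod(code * 0x9E3779B1, lengthInBits));
        }
        return bits;
    }

    // Darkness = fraction of bits set to 1.
    static double darkness(BitSet fp, int lengthInBits) {
        return (double) fp.cardinality() / lengthInBits;
    }

    // Raw fingerprint storage only; a real cache holds more than this.
    static long fingerprintBytes(long structureCount, int lengthInBits) {
        return structureCount * lengthInBits / 8;
    }

    public static void main(String[] args) {
        int[] features = new int[150];
        for (int i = 0; i < features.length; i++) {
            features[i] = i + 1; // made-up feature codes
        }
        for (int length : new int[] {512, 1024, 2048}) {
            BitSet fp = fingerprint(features, length);
            System.out.printf("%d bits: darkness %.3f, %d bytes per million structures%n",
                    length, darkness(fp, length), fingerprintBytes(1_000_000L, length));
        }
    }
}
```

In this sketch, doubling the length doubles the fingerprint storage per structure, which is the reason a longer fingerprint needs a larger cache.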

Limiting the number of hits

By using "JChemSearch.setMaxResultCount()" you can limit the number of hits if you do not need them all.

(For example, it makes no sense to provide a hit list of millions of compounds for browsing.)

http://www.chemaxon.com/jchem/doc/api/chemaxon/jchem/db/JChemSearch.html#setMaxResultCount(int)

Since search time is roughly linear in the number of structures that pass screening and must be verified (which is approximately linear in the number of hits), the search time will remain about constant even for huge databases if you use this feature.

(This is because the screening time itself is usually negligible.)
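The reasoning above can be sketched as a toy screen-then-verify loop in Java. This is not the JChem search engine and does not call the JChem API: the class, the integer row ids, and the two made-up predicates are assumptions for illustration, and stopping as soon as the limit is reached is an assumed behaviour. The point is only that the number of expensive verifications depends on the hit limit, not on the table size.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.IntPredicate;

// Illustrative only: a toy screen-then-verify loop, NOT the JChem search engine.
final class CappedSearchSketch {

    static final class Result {
        final List<Integer> hits = new ArrayList<>();
        int verified; // number of expensive structure checks performed
    }

    static Result search(int tableSize, IntPredicate passesScreen,
                         IntPredicate matches, int maxResultCount) {
        Result result = new Result();
        // Stop as soon as enough hits have been collected.
        for (int id = 0; id < tableSize && result.hits.size() < maxResultCount; id++) {
            if (!passesScreen.test(id)) {
                continue; // cheap fingerprint screen rejected this row
            }
            result.verified++; // expensive step: full structure comparison
            if (matches.test(id)) {
                result.hits.add(id);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        IntPredicate screen = id -> id % 10 == 0; // made-up screening rule
        IntPredicate match = id -> id % 20 == 0;  // made-up matching rule
        for (int size : new int[] {1_000_000, 10_000_000}) {
            Result r = search(size, screen, match, 100);
            System.out.println(size + " rows: " + r.hits.size() + " hits, "
                    + r.verified + " verifications");
        }
    }
}
```

With these made-up rules, both table sizes need the same number of verifications to collect 100 hits, because the loop ends once the limit is reached.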

Structural Keys

Although our fingerprint is a chemical hashed fingerprint, we also have a concept similar to MDL keys, which can be handy in certain cases.

The chemical hashed fingerprint can be extended with Structural Keys.

Each key represents a query structure, and its bit is set to 1 if the target contains that structure.

If a query structure exactly matches one of the structural keys, this is recognized at the start of the search, and the substructure results are returned almost instantaneously.

(We only have to check whether the bit is set.)
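A minimal Java sketch of this idea follows. It is not the JChem implementation: the class name, the example key strings, and the bit layout are hypothetical, and recognizing a key by plain string equality is a simplification of real structure comparison. If the query is one of the keys, answering it is a single bit test per target; otherwise the caller has to fall back to a normal search.

```java
import java.util.ArrayList;
import java.util.BitSet;
import java.util.List;
import java.util.Map;
import java.util.Optional;

// Illustrative only: a toy structural-key lookup, NOT the JChem implementation.
final class StructuralKeySketch {

    // Hypothetical keys fixed at table creation: query string -> bit index.
    static final Map<String, Integer> KEY_BITS = Map.of(
            "C(=O)O", 0,
            "c1ccccc1", 1,
            "[N+](=O)[O-]", 2);

    // Returns the matching target ids, or empty if the query is not a key.
    static Optional<List<Integer>> searchByKey(String query, List<BitSet> targetKeys) {
        Integer bit = KEY_BITS.get(query);
        if (bit == null) {
            return Optional.empty(); // not a key: a normal search is needed
        }
        List<Integer> hits = new ArrayList<>();
        for (int id = 0; id < targetKeys.size(); id++) {
            if (targetKeys.get(id).get(bit)) { // one bit test per target
                hits.add(id);
            }
        }
        return Optional.of(hits);
    }

    public static void main(String[] args) {
        BitSet first = new BitSet();
        first.set(0);
        first.set(1);
        BitSet second = new BitSet();
        second.set(1);
        List<BitSet> targets = List.of(first, second, new BitSet());
        System.out.println(searchByKey("c1ccccc1", targets));
        System.out.println(searchByKey("CCC", targets));
    }
}
```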

This is useful if you have a fixed set of very frequently used queries (e.g. functional groups).

Currently these keys can only be specified at table creation, and cannot be changed afterwards, but we are planning to work on this.

(It is a relatively new feature.)

For more information please visit:

http://www.chemaxon.com/jchem/doc/admin/#create