Problems with jc_equals and jc_compare

01-05-2007 14:52:14

And this returns zero.

We will answer the remaining questions soon.

Thanks

Peter

01-05-2007 20:10:09

I tried it with both your query and the more selective O=Cc1ccccc1 on the NCI dataset with JChem 3.2.5. Perfect search was 1.5 to 2 times faster than exact search with either query: about 40 ms with perfect search versus 70-80 ms with exact search. (The speedup was smaller with the more selective query than with yours.)

Thanks

Peter

03-05-2007 16:18:12

In general, these two search types may base the quick pre-filtering (screening) phase on different principles.

In perfect search mode the indexed cd_hash column is used to get a list of possible duplicates.

During exact search, screening is performed with fingerprints, which are stored in memory (the structure cache).

From JChem version 3.1.4 onward, the hash code is also used for exact search whenever applicable (that is, when the query structure contains no query features).

Screening with the hash code is expected to be the more efficient filter (fewer candidates are left for the graph search).
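To illustrate the difference between the two screening principles, here is a minimal toy sketch. It is not JChem code: `toy_hash`, `toy_fingerprint`, the plain strings standing in for structures, and the string comparison standing in for the graph search are all assumptions made for illustration only.

```python
from collections import defaultdict


def toy_hash(structure):
    """Hypothetical stand-in for a stored hash code.

    It ignores character order, so collisions are possible.
    """
    return sum(ord(ch) for ch in structure) % 97


def toy_fingerprint(structure):
    """Hypothetical stand-in for a fingerprint: one bit per distinct character."""
    bits = 0
    for ch in set(structure):
        bits |= 1 << (ord(ch) % 16)
    return bits


def build_hash_index(table):
    """Map each hash value to the row ids that share it (plays the role of an index)."""
    index = defaultdict(list)
    for row_id, structure in enumerate(table):
        index[toy_hash(structure)].append(row_id)
    return index


def screen_by_hash(index, query):
    """Candidates are the rows whose hash equals the query's hash."""
    return list(index.get(toy_hash(query), []))


def screen_by_fingerprint(fingerprints, query):
    """Candidates are the rows whose fingerprint contains every query bit."""
    q = toy_fingerprint(query)
    return [row_id for row_id, fp in enumerate(fingerprints) if (fp & q) == q]


def verify(table, candidates, query):
    """Stand-in for the expensive graph search run on the surviving candidates."""
    return [row_id for row_id in candidates if table[row_id] == query]


TABLE = ["CCO", "OCC", "CCN", "c1ccccc1", "CCOC", "CCO"]
HASH_INDEX = build_hash_index(TABLE)
FINGERPRINTS = [toy_fingerprint(s) for s in TABLE]

hash_candidates = screen_by_hash(HASH_INDEX, "CCO")
fp_candidates = screen_by_fingerprint(FINGERPRINTS, "CCO")
```

In this toy data both screens keep the two true matches, but the hash screen passes three candidates to the verification step while the fingerprint screen passes four. Real hash codes and fingerprints behave differently in detail; the sketch only shows why an equality-based filter tends to leave fewer candidates than a containment-based one.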

Is your version older than 3.1.4?

If

- there is no index on the cd_hash column (for example, it was removed by hand), or

- the RDBMS (or the connection to it) is very slow for some reason,

then that may explain what you are seeing.
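As a rough illustration of why a missing index on cd_hash matters, the sketch below uses Python's built-in sqlite3 module with an in-memory database. This is only an analogy: the table name `structures`, the `cd_id` column, the index name `idx_cd_hash`, and the use of SQLite itself are assumptions, and your own RDBMS has its own tools for listing indexes and showing query plans.

```python
import sqlite3


def query_plan(conn, sql, params=()):
    """Return SQLite's plan description for a statement as a single string."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    # The human-readable detail is the last column of every plan row.
    return " | ".join(str(row[-1]) for row in rows)


conn = sqlite3.connect(":memory:")

# Hypothetical, heavily simplified structure table.
conn.execute(
    "CREATE TABLE structures (cd_id INTEGER PRIMARY KEY, cd_hash INTEGER)"
)
conn.executemany(
    "INSERT INTO structures (cd_hash) VALUES (?)",
    [(i % 1000,) for i in range(10000)],
)

LOOKUP = "SELECT cd_id FROM structures WHERE cd_hash = ?"

# Without an index the lookup has to scan the whole table.
plan_without_index = query_plan(conn, LOOKUP, (42,))

# With an index the lookup can go straight to the matching rows.
conn.execute("CREATE INDEX idx_cd_hash ON structures (cd_hash)")
plan_with_index = query_plan(conn, LOOKUP, (42,))

print("without index:", plan_without_index)
print("with index:   ", plan_with_index)
```

The first plan reports a full table scan and the second a search that uses the index. On a large structure table this is the difference between reading every row and reading only the handful that share the hash value.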

Best regards,

Szilard

04-05-2007 11:09:56

I am sorry about the delay in the answer.

The problem may lie in the SMILES import (stereochemistry specified next to the ring closure), but the colleague who develops this part is currently unavailable, so he cannot confirm it yet. We will be able to give a definite answer by Monday.

07-05-2007 15:13:26

In the next major release (3.3), double bonds in small rings (ring size <= 7) will always be treated as cis, regardless of how they are specified in the input. (This is the real-life situation for all these rings because of the constraints of the ring geometry.)
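The rule described above can be sketched roughly as follows. This is not the JChem implementation: the `Bond` record, its fields, and `normalize_ring_stereo` are hypothetical names used only to show the logic of the rule.

```python
from dataclasses import dataclass, replace
from typing import Optional

# Rings up to this size are treated as "small" (value taken from the post above).
SMALL_RING_MAX = 7


@dataclass(frozen=True)
class Bond:
    """Hypothetical, minimal bond record."""
    order: int                    # 1 = single, 2 = double, ...
    stereo: Optional[str]         # "cis", "trans", or None if unspecified
    smallest_ring: Optional[int]  # size of the smallest ring containing the bond


def normalize_ring_stereo(bond):
    """Force double bonds in small rings to cis, whatever the input says."""
    in_small_ring = (
        bond.smallest_ring is not None
        and bond.smallest_ring <= SMALL_RING_MAX
    )
    if bond.order == 2 and in_small_ring:
        return replace(bond, stereo="cis")
    # Acyclic bonds, large-ring bonds and non-double bonds are left untouched.
    return bond
```

For example, a double bond marked trans in a six-membered ring comes back as cis, while the same marking in an eight-membered ring or on an acyclic bond is kept as given.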

We are checking whether Standardizer offers an immediate workaround that could help you.

Szabolcs