Filter Query Performance

05-05-2006 18:20:49

and it takes 26 seconds. The filter query itself takes only 32 milliseconds and returns 3000 rows. The STRUCTURE table contains about 2.5M rows. The JChem version is 3.1.5.

Is there a way to speed it up?

08-05-2006 10:24:20

where "indexTable" is <index-name>_jcx

3) A structure search is executed using the JChem Java API on the cd_ids returned by the previous query

3a) "indexTable" is loaded into the structure cache, if not yet loaded

3b) A pre-screening is performed based on fingerprints

3c) A graph search is performed on the structures left over from the fingerprint pre-screening

4) The cd_ids of the hits are returned from JChem Streams to Oracle

5) The cd_ids are converted to rowids of the base table (in your case: of the table structure)

6) The rowids are returned to the Oracle Execution engine which internally processes them further.
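Steps 3b and 3c can be pictured with a minimal sketch. This is not the JChem implementation: fingerprints are modelled here as plain Python integers used as bit sets, the subset-of-bits test is the usual textbook screening criterion rather than anything confirmed for JChem, and `passes_prescreen`, `search` and `graph_match` are hypothetical names chosen for illustration only.

```python
from typing import Callable, Iterable, List, Tuple


def passes_prescreen(query_fp: int, target_fp: int) -> bool:
    """Cheap screen (assumed criterion): a target can only contain the query
    if every bit set in the query fingerprint is also set in the target."""
    return (query_fp & target_fp) == query_fp


def search(
    query_fp: int,
    candidates: Iterable[Tuple[int, int, str]],
    graph_match: Callable[[str], bool],
) -> List[int]:
    """Return the cd_ids of candidates that survive both stages.

    candidates  -- (cd_id, fingerprint, structure) triples, e.g. the rows
                   selected by the filter query
    graph_match -- hypothetical stand-in for the expensive exact graph search
    """
    hits = []
    for cd_id, fingerprint, structure in candidates:
        # Stage 1 (3b): discard structures the fingerprint rules out.
        if not passes_prescreen(query_fp, fingerprint):
            continue
        # Stage 2 (3c): run the expensive check only on what is left over.
        if graph_match(structure):
            hits.append(cd_id)
    return hits


# Toy data: the substring test only stands in for a real graph search.
rows = [(1, 0b0111, "CCO"), (2, 0b0011, "CCO"), (3, 0b1101, "CCN")]
hits = search(0b0101, rows, lambda smiles: "CCO" in smiles)
```

In the toy data, row 2 is rejected by the fingerprint screen and row 3 by the stand-in graph check, so only the cheap test is paid for structures that cannot match.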

In an attempt to reproduce your case, I set up a structure table "structure" with 2.6 million SMILES and a primary key column called cpd_sid. cpd_sid is of type numeric(10,0), with values incrementing by 1 from 1 up to 2.6 million. I slightly reformulated your query as follows:

08-05-2006 17:56:39

The time drops to under a second if I run exactly the same query several times, but only the first run corresponds to a real-life scenario.

With my own query, however, it still takes longer the first time:

08-05-2006 20:43:32

It is this first execution of your query where your entire SQL statement (with jc_compare) takes ~26 seconds, isn't it?

How long does the entire SQL statement (with jc_compare) take to execute the second time, when the JChem Java API search executes in 556 ms?

Thanks

Peter

08-05-2006 22:45:59

Yes, that's correct. It is not 26 seconds every time; it ranges from 15 to 40 seconds depending on the value of p.screen_id in my example (the number of returned rows varies between 2000 and 6000). I cannot reproduce exactly the same situation every time because the query is already cached in Oracle.
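One way to keep the first (cold) run apart from the cached repeats when collecting such timings is sketched below. It is only an assumption about how one might measure: `time_runs` and `run_query` are hypothetical names, `run_query` stands for whatever executes the statement, and the helper does nothing to clear Oracle's or JChem's caches between runs.

```python
import time
from typing import Callable, List, Tuple


def time_runs(run_query: Callable[[], object], repeats: int = 3) -> Tuple[float, List[float]]:
    """Run the same query several times and return (cold_ms, warm_ms_list).

    The first timing is reported separately because later runs may be
    served from caches and so do not reflect the real-life case.
    """
    timings_ms = []
    for _ in range(repeats):
        start = time.perf_counter()
        run_query()  # hypothetical callable that executes the statement
        timings_ms.append((time.perf_counter() - start) * 1000.0)
    return timings_ms[0], timings_ms[1:]


# Example with a placeholder workload instead of a real database call.
cold_ms, warm_ms = time_runs(lambda: sum(range(1000)), repeats=3)
```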

09-05-2006 06:12:33

where :a is a value which you think is meaningful for the test.

How long does this query take when run for the first time? Is it much slower than the enclosed filter query?

Thanks

Peter

11-05-2006 18:03:56

Total time of executing the same query is 15501 ms.

11-05-2006 20:21:05

The 736 ms does not seem to be very significant compared to the total time of 13818 ms, does it? (The time to execute the filter query is actually included in the 13818 ms of the JChem Java API search.)

Would it be possible to run a few tests with my example query using a literal screen_id (no variable binding)? Does it make a huge difference? (JChem Cartridge is, of course, unable to substitute bind variables for literal values in the text of the filter queries.)

Thanks

Peter

11-05-2006 21:17:57

No, 736 is acceptable, although imho strange for such a simple query.

13 seconds is a little too much :)

Using variable binding in my filter query does not noticeably affect the performance.

12-05-2006 19:12:43

where <literal-screen-id> is a value with which you experienced slow searches. Please try to run the explain plan so that the resulting plan matches, as closely as possible, the plan used during the substructure search in Tomcat. In other words: try to eliminate the factors likely to cause a change in the execution plan. Please review the "How Execution Plans Can Change" section of the chapter "Using EXPLAIN PLAN" in the "Oracle Database Performance Tuning Guide and Reference" for a list of the possible factors. Please post the resulting plan, preferably formatted (with select * from table(dbms_xplan.display()) for example).

It would also be interesting to know whether you use table statistics or not.

Thank you,

Peter