User 8139ea8dbd
24-10-2011 15:30:22
Using the cartridge, substructure search with
[H]C1=C(C)C2=C([H])C(=C([H])N=C2N1)C1=C([H])C([H])=C([H])C([H])=C1[H]
is very slow.
If you remove [H] and use (s*), performance is normal. It seems like explicit hydrogen is not handled in an optimal way.
ChemAxon 8407015329
26-10-2011 11:27:59
Hi,
We have started checking the possible scenarios (default settings with the latest JCB) for the issue you experienced. In the meantime, could you please send us some additional data, such as:
- what version are you using
- what kind of table are you searching and how many structures are in it
- did you use any specific search options
- what was the exact command executed
Regards,
Vencel
User 4cd5052280
03-11-2011 16:45:22
JCHEM_CORE_PKG.GETENVIRONMENT()
------------------------------------------------------------
Oracle environment:
Oracle Database 10g Enterprise Edition Release 10.2.0.3.0 -
64bi
PL/SQL Release 10.2.0.3.0 - Production
CORE 10.2.0.3.0 Production
TNS for Solaris: Version 10.2.0.3.0 - Production
NLSRTL Version 10.2.0.3.0 - Production
JChem Server environment:
Java VM vendor: Sun Microsystems Inc.
Java version: 1.6.0_26
Java VM version: 20.1-b02
JChem version: 5.5.1.0
JChem Index version: 5050100
JDBC driver version: 11.1.0.7.0-Production
SQL> select count(*) from cpd;
COUNT(*)
----------
5482273
select cpd_sid from gnf_imc.CPD where jc_compare(jc_smiles, '[H]C1=C(C)C2=C([H])C(=C([H])N=C2N1)C1=C([H])C([H])=C([H])C([H])=C1[H]', 't:s')=1;
ChemAxon 8407015329
04-11-2011 19:51:30
Hi,
Unfortunately, we were unable to reproduce the slowdown, even with that many structures in the target table. We only experienced a ~1.2-fold slowdown, which is reasonable considering that the number of hydrogen atoms in the query structure can affect the atom-by-atom search.
Is the issue you are experiencing reproducible every time you search? What is the slowdown factor you experience? Is the issue present if you query only a part of the table (using a filter query, for example)?
Regards,
Vencel
User 8139ea8dbd
04-11-2011 21:36:26
This looks like a total mystery. The SQL basically never returns (something seems to have stalled in the backend).
I first tried
select cpd_sid,jc_smiles from cpd where jc_compare(jc_smiles, '[H]C1=C(C)C2=C([H])C(=C([H])N=C2N1)C1=C([H])C([H])=C([H])C([H])=C1[H]', 't:na')=1
and I got 461 structures that passed the initial screening.
Then, for each of the 461 candidates, I ran
select jc_compare(<smiles>, '[H]C1=C(C)C2=C([H])C(=C([H])N=C2N1)C1=C([H])C([H])=C([H])C([H])=C1[H]', 't:s') from dual
and it went through all of them without problem.
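A per-candidate check like the one above could be scripted by generating one statement per candidate. The following is only a sketch: gen_compare_sql is a hypothetical helper (not part of JChem), and it assumes the candidate SMILES have been exported to a text file, one per line. The generated statements follow the jc_compare ... from dual form shown above.

```shell
# Hypothetical helper: read candidate SMILES from stdin (one per line)
# and print one jc_compare statement per candidate for the given query.
gen_compare_sql() {
  query=$1
  # The "|| [ -n ... ]" part keeps a last line that lacks a trailing newline.
  while IFS= read -r smiles || [ -n "$smiles" ]; do
    # Skip blank lines.
    [ -n "$smiles" ] || continue
    printf "select jc_compare('%s', '%s', 't:s') from dual;\n" "$smiles" "$query"
  done
}

# Example (file names are assumptions):
#   gen_compare_sql "$QUERY" < candidates.smi > check_candidates.sql
```

The resulting script can then be run in the SQL client of choice, which makes it easier to see whether any single candidate stalls on its own.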
But if I run
select cpd_sid,jc_smiles from cpd where jc_compare(jc_smiles, '[H]C1=C(C)C2=C([H])C(=C([H])N=C2N1)C1=C([H])C([H])=C([H])C([H])=C1[H]', 't:s maxTime:1000 maxHitCount:5')=1;
it never returns (maxTime has no effect).
(Side note: when I do
select * from cpd where jc_compare(jc_smiles, '[H]C1=C(C)C2=C([H])C(=C([H])N=C2N1)C1=C([H])C([H])=C([H])C([H])=C1[H]', 't:na haltOnError:y')=1;
I got an exception saying: ORA-29902: error in executing ODCIIndexStart() routine, ORA-20102: Invalid search option: error: uknown option name: haltonerror Use -h for help. ORA-06512: at "JCHEM_CART.JCHEM_CORE_PKG", line 34 ORA-06512: at "JCHEM_CART.JC_IDXTYPE_IM" line 483 ORA-06512: at line 1)
It seems haltOnError is no longer a valid option; maybe the documentation needs to be updated?
What do you suggest we do next?
Thanks
ChemAxon aa7c50abf8
04-11-2011 23:24:55
Would it be possible to temporarily increase the log level? To do so, add the following lines to the jchem/cartridge/conf/logging.properties file:
chemaxon.jchem.db.level = FINEST
chemaxon.jchem.cartridge.level = FINEST
For these changes to take effect, the JChem Cartridge server currently has to be restarted.
It would also be very helpful if you could execute the following command a couple of times, a few seconds apart, while the problematic search is running/hanging:
bash server.sh thread-dump 2>> thread-dump.log
It needs to be executed in the same directory where the JChem Cartridge server is started and stopped (i.e. in jchem/cartridge).
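Taking the dumps a few seconds apart could be automated with a small loop. The following is only a sketch: collect_dumps is a hypothetical helper (not part of JChem), and the count and interval in the example are arbitrary choices. The only command taken from this thread is bash server.sh thread-dump with its stderr appended to thread-dump.log.

```shell
# Hypothetical helper: run a command COUNT times, INTERVAL seconds apart,
# appending its stderr to LOG (mirrors the "2>> thread-dump.log" above).
collect_dumps() {
  count=$1; interval=$2; log=$3
  shift 3
  i=1
  while [ "$i" -le "$count" ]; do
    # Marker line so the individual dumps can be told apart in the log.
    printf '=== run %d: %s ===\n' "$i" "$(date)" >> "$log"
    "$@" 2>> "$log"
    # Pause between runs, but not after the last one.
    if [ "$i" -lt "$count" ]; then sleep "$interval"; fi
    i=$((i + 1))
  done
}

# Example, run from jchem/cartridge while the search is hanging:
#   collect_dumps 5 3 thread-dump.log bash server.sh thread-dump
```

The marker lines are an addition for readability only; they can be dropped if the log should contain nothing but the dumps.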
Could you please send the log files (from jchem/cartridge/logs) as well as the thread-dump.log file to pkovacs at chemaxon dot com?
(The haltOnError search option was added in 5.6.0.0. It was not available in 5.5.x. The ineffectiveness of maxTime may be related to the problem at hand.)
Thank you,
Peter
ChemAxon aa7c50abf8
05-11-2011 13:09:35
PS:
If the FINEST log level results in excessively large log files, FINER should also do the job in this first round.
ChemAxon 8407015329
10-11-2011 10:15:59
Hi,
We finally succeeded in reproducing the issue. It is caused by faulty query enumeration in the database search mechanism, which makes the query with the explicit H atoms about 20 times slower.
This issue has been fixed in version 5.7, which is due for release in a couple of days. If the workaround with s* is not a suitable alternative for you, we suggest upgrading to version 5.7.
Thanks for all the help in detecting the issue,
Vencel