User 773d472e7f
06-05-2013 15:27:21
Dear CHEMAXON:
I have a question about the performance of substructure searching. My setup is:
JChem molecule database on a virtual private server.
Molecule table with more than 15 million rows.
Ubuntu Linux vServer with MySQL.
The table contains some duplicate structures.
The table contains some empty molecules.
A duplicate search returns its result within 1 or 2 seconds, e.g.:
./jcsearch -t:d -q "CCOc1cc(C=O)ccc1OCCCN(CC)CC" DB:AKOS_MOLTABLE
CCOC1=C(OCCCN(CC)CC)C=CC(C=O)=C1
CCOC1=C(OCCCN(CC)CC)C=CC(C=O)=C1
A substructure search takes a very long time. For example, the following takes over 1 hour to return its results:
root@v14908:~/ChemAxon/JChem/bin# ./jcsearch -Xmx4000M -server -t:s -q "CCOc1cc(C=O)ccc1OCCCN(CC)CC" DB:AKOS_MOLTABLE
CCOC1=C(OCCCN2CCCCC2C)C=CC(C=O)=C1
CCOC1=C(OCCCN2CCC(C)CC2)C=CC(C=O)=C1
CCOC1=C(OCCCN2CCN(CCO)CC2)C=CC(C=O)=C1
CCOC1=C(OCCCN2CC(C)CC(C)C2)C=CC(C=O)=C1
...etc
Subsequent searches also take the same amount of time.
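For reference, this is roughly how the consecutive runs can be timed. It is only a minimal sketch: `time_runs` is a hypothetical helper name (not a JChem tool), the search command is the same one shown above, and each run is a separate process launched from the shell.

```shell
#!/bin/sh
# Sketch: run a command several times in a row and print the wall-clock
# time of each run. time_runs is a hypothetical helper, not part of JChem.
time_runs() {
    runs=$1
    shift
    i=1
    while [ "$i" -le "$runs" ]; do
        start=$(date +%s)
        # Discard the hits; only the elapsed time is of interest here.
        "$@" > /dev/null 2>&1
        end=$(date +%s)
        echo "run $i: $((end - start)) s"
        i=$((i + 1))
    done
}

# Only attempt the search when started from the JChem bin directory.
if [ -x ./jcsearch ]; then
    time_runs 2 ./jcsearch -Xmx4000M -server -t:s \
        -q "CCOc1cc(C=O)ccc1OCCCN(CC)CC" DB:AKOS_MOLTABLE
fi
```

With this, the first and second run can be compared directly; in my case both take about the same time.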
My understanding is that the first search can take a long time while the cache loads, but subsequent searches should be much faster.
This is not happening in this case. Could you please advise what I should look at to improve the performance, for example how to check the state of the cache?
I look forward to your reply.
Kind regards,
Bernard D'Alwis