Hi Imants,
I tried PubChem. In JChem Base, a full search with your latest R-query took 660 s (about 11 min) on our test server. The substructure search took 479 s (about 8 min) and found 346290 hits; that figure does not include the time needed to return the results.
Search mode: SUBSTRUCTURE
Structure table: PKOVACSUSER_520.PUBCHEM
Query: [$([#6]C=O),$(FC(F)(F)C=O),$([#6]C([#6])([#6])OC=O),$(O=COC1=CC=CC=C1),$(C1=CC=C(C=C1)C(C1=CC=CC=C1)C1=CC=CC=C1),$([#6]C1=CC=CC=C1),$([#6]C1=CC=C(C=C1)S(=O)=O),$(O=CCC1C2=CC=CC=C2C2=CC=CC=C12)]N1CCCC1
Screened: 1625241
Hits: 346290
Cache loading: 757741 ms
Cache size (this table / total): 1966.84 / 1966.84 MBytes
Total time: 479212 ms Screening: 1299 ms
Processing threads: 4
Current / peak / maximum searches per minute: 1 / 1 / Unlimited
Found 346290 hits in table PKOVACSUSER_520.PUBCHEM.
Search mode: FULL
Structure table: PKOVACSUSER_520.PUBCHEM
Query: [$([#6]C=O),$(FC(F)(F)C=O),$([#6]C([#6])([#6])OC=O),$(O=COC1=CC=CC=C1),$(C1=CC=C(C=C1)C(C1=CC=CC=C1)C1=CC=CC=C1),$([#6]C1=CC=CC=C1),$([#6]C1=CC=C(C=C1)S(=O)=O),$(O=CCC1C2=CC=CC=C2C2=CC=CC=C12)]N1CCCC1
Screened: 1625241
Hits: 9
Cache loading: 756777 ms
Cache size (this table / total): 1966.84 / 1966.84 MBytes
Total time: 659617 ms Screening: 1205 ms
Processing threads: 4
Current / peak / maximum searches per minute: 1 / 1 / Unlimited
Found 9 hits in table PKOVACSUSER_520.PUBCHEM.
In both cases, the initial screening left about 1.6M structures (1625241) still to be searched, which is a lot.
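
To put the figures from the two logs side by side, here is a small Python sketch. It is only an illustration: the `summarize` helper and the `LOGS` dictionary are names I made up for this note, not part of JChem, and the numbers are simply copied from the log lines above.

```python
import re

# Figures copied verbatim from the two search logs in this message.
LOGS = {
    "SUBSTRUCTURE": "Screened: 1625241\nHits: 346290\n"
                    "Total time: 479212 ms Screening: 1299 ms",
    "FULL": "Screened: 1625241\nHits: 9\n"
            "Total time: 659617 ms Screening: 1205 ms",
}


def summarize(log_text):
    """Hypothetical helper: pull counts and timings out of one log excerpt."""

    def field(label):
        # Each value of interest appears as "<label>: <integer>".
        match = re.search(rf"{label}:\s*(\d+)", log_text)
        if match is None:
            raise ValueError(f"field not found: {label}")
        return int(match.group(1))

    screened = field("Screened")
    hits = field("Hits")
    total_ms = field("Total time")
    screening_ms = field("Screening")
    return {
        "screened": screened,
        "hits": hits,
        "total_min": total_ms / 60000,                # milliseconds to minutes
        "after_screening_ms": total_ms - screening_ms,
        "hit_fraction": hits / screened,              # share of screened structures that are hits
    }


if __name__ == "__main__":
    for mode, text in LOGS.items():
        s = summarize(text)
        print(f"{mode}: {s['total_min']:.1f} min total, "
              f"{s['after_screening_ms']} ms after screening, "
              f"{s['hit_fraction']:.4%} of screened structures are hits")
```

The point of the comparison is that screening itself takes just over a second in both modes, so nearly all of the total time is spent on the structures that screening lets through.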
As I said, we will improve screening for this type of query. That should bring the search time down significantly for full search, and hopefully somewhat for substructure search as well.
However, I am not sure we will be able to squeeze this development into the next major release (5.4), so it is only realistic to expect it in 5.5, in the first half of next year.
Best regards,
Szabolcs
[scsepregi@prefect bin]$ ./jcsearch -Xmx4000M -vv -t:f -q ../../../workspace/db
Bug0003/R-query.mrv DB:PKOVACSUSER_520.PUBCHEM -f :TCD_ID >jcs_log_full.txt
Mon Jun 21 13:34:31 CEST 2010
Search mode: FULL
Structure table: PKOVACSUSER_520.PUBCHEM
Query: [$([#6]C=O),$(FC(F)(F)C=O),$([#6]C([#6])([#6])OC=O),$(O=COC1=CC=CC=C1),$(C1=CC=C(C=C1)C(C1=CC=CC=C1)C1=CC=CC=C1),$([#6]C1=CC=CC=C1),$([#6]C1=CC=C(C=C1)S(=O)=O),$(O=CCC1C2=CC=CC=C2C2=CC=CC=C12)]N1CCCC1
Screened: 1625241
Hits: 9
Cache loading: 756777 ms
Cache size (this table / total): 1966.84 / 1966.84 MBytes
Total time: 659617 ms Screening: 1205 ms
Processing threads: 4
Current / peak / maximum searches per minute: 1 / 1 / Unlimited
Found 9 hits in table PKOVACSUSER_520.PUBCHEM.
[scsepregi@prefect bin]$