Search performance - ChemAxon Forum Archive

User 773d472e7f

30-04-2013 13:45:25

I built a test database with 1.2 million structures as a local database. The good news is that one can build such a large database. However, at about 4 minutes, a substructure search (SSS) seems to be about as fast (or slow) as a text string search. It seems the structures are not specifically indexed. Or do I need to make adjustments?

Alex

ChemAxon 2bdd02d1e5

30-04-2013 14:23:02

Hi Alex,

There should be no need to make any adjustments, so it seems that the time is about right. I'm afraid that it cannot be sped up easily. We have some customers using MySQL with several million structures, and the search takes several minutes to run...

We provide a solution based on Oracle Cartridge, which is particularly suitable for large databases. The difference in performance compared with the local database is enormous. Please refer to the JChem Cartridge product if you are interested.

Filip

ChemAxon 2bdd02d1e5

30-04-2013 14:56:30

Sorry, I was not precise about the search times. Only the first search can take minutes; subsequent searches should finish within seconds once the structures are in the cache.

Filip
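
The "slow first search, fast later searches" behaviour described above can be illustrated with a toy in-memory cache. This is only a sketch and not Instant JChem's implementation: the class `CachedStructureSearch`, its `slowLoader`, and the use of plain substring matching in place of real substructure matching are all hypothetical stand-ins.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Supplier;

// Hypothetical illustration only; not how Instant JChem is implemented.
final class CachedStructureSearch {
    // Stands in for the expensive step of reading every structure from the database.
    private final Supplier<List<String>> slowLoader;
    private List<String> cache;   // stays null until the first search
    private int loadCount = 0;    // how many times the expensive load has run

    CachedStructureSearch(Supplier<List<String>> slowLoader) {
        this.slowLoader = slowLoader;
    }

    // Assumption: substring matching is a placeholder for substructure search.
    List<String> search(String query) {
        if (cache == null) {
            // First search: pay the full load cost once.
            cache = slowLoader.get();
            loadCount++;
        }
        // Every search, including the first, scans the in-memory copy.
        List<String> hits = new ArrayList<>();
        for (String structure : cache) {
            if (structure.contains(query)) {
                hits.add(structure);
            }
        }
        return hits;
    }

    int loadCount() {
        return loadCount;
    }

    public static void main(String[] args) {
        CachedStructureSearch db = new CachedStructureSearch(
                () -> Arrays.asList("CCO", "c1ccccc1O", "CC(=O)O"));
        System.out.println(db.search("CC"));
        System.out.println(db.search("O"));
        System.out.println("loads: " + db.loadCount());
    }
}
```

In this toy model the second search never touches the loader, which is why it returns quickly regardless of how slow the initial load was.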

User 773d472e7f

30-04-2013 16:34:44

Thanks Filip

Yes - only the first search is not so fast!

I am working on a 64-bit Windows 8 machine (modified to the Windows 7 look and feel) with 8 GB RAM. Would it make a difference to install 64-bit Java? At present the 32-bit version is installed.

We are also testing speed on a 64-bit Linux server with MySQL as the database.

Alex

ChemAxon 2bdd02d1e5

30-04-2013 17:18:11

I have no benchmarks available with respect to your question right now.

Basically, I would not expect a big difference in the first search. It is limited by the Derby database engine (for the local database) and its I/O operations.

What could help with a really big database is increasing the Java heap memory available to Instant JChem. But this only matters for subsequent searches served from the cache (a larger heap lets the cache store more structures). The Java heap size for 32-bit Java is limited to ~1300 MB, whereas with 64-bit Java you can set more.

Thanks
Filip
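
To check which Java is in use and how much heap it may claim, a small standalone program along these lines could help. It is a minimal sketch: the class name `HeapCheck` is made up for illustration, and the `sun.arch.data.model` property is HotSpot-specific, so it may be missing on other JVMs.

```java
// Minimal sketch: report the JVM's bitness and its maximum heap size.
final class HeapCheck {
    // Maximum amount of memory the JVM will attempt to use, in MiB.
    static long maxHeapMiB() {
        return Runtime.getRuntime().maxMemory() / (1024L * 1024L);
    }

    public static void main(String[] args) {
        // Assumption: "sun.arch.data.model" is set ("32" or "64") on HotSpot JVMs;
        // fall back to "unknown" when the property is absent.
        String bits = System.getProperty("sun.arch.data.model", "unknown");
        System.out.println("JVM data model: " + bits);
        System.out.println("os.arch: " + System.getProperty("os.arch"));
        System.out.println("Max heap: " + maxHeapMiB() + " MiB");
    }
}
```

Running it with the JVM's `-Xmx` option, for example `java -Xmx1g HeapCheck`, shows the effect of a larger heap setting on the reported maximum. How that option is passed to Instant JChem itself is not covered in this thread.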

User f67d4188b6

06-05-2013 07:57:19

Search times are roughly similar with a remote MySQL database running on a Linux server with bucketloads of RAM.

On a table of 9.8 million compounds, the initial search takes around 40 minutes in our setup. That is about the time needed to build a new index, so I guess that is what happens (probably client-side). Consecutive searches use the same calculated index (or whatever the term is) and are therefore superfast.
If you then change something essential in the search criteria (not adding more restrictions, but, for example, changing the scaffold structure in your search), this whole process takes place all over again.