Slow table regeneration

User e05b1833aa

13-01-2011 09:44:53

Hi,


I upgraded to IJC 5.4, which requires all tables to be regenerated. This seems to go very slowly: for a 450K compound database it has so far taken ~18 h to regenerate 70% of the database. In contrast, importing the structures into the database when creating it took less than 2 h. Cancelling regeneration also takes forever, so I will let it finish, but I also have a 5 million compound database that needs to be regenerated.


Is this normal? I have a very fast MySQL-based server; substructure searching in the 5 million compound database takes just tens of seconds.

ChemAxon fa971619eb

13-01-2011 09:59:08

Regenerating a large table will take some time. The fingerprints for every structure need to be regenerated and written to the database. Some things that might be useful to speed this up:


1. Perform the regeneration on the same computer as the database to avoid the need for network traffic. If this is not possible, then definitely make sure you have a fast connection to the database.


2. Delete chemical terms columns first. These will also need to be regenerated, and if some of the calculations are slow (e.g. logD, pKa) then this can considerably slow down the regeneration process. Once the table has been regenerated, these calculations can be added back at a time of your convenience.
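
To put the figures quoted in the question in perspective, here is a rough back-of-the-envelope estimate. It is only a sketch: it assumes the regeneration throughput stays constant and scales linearly with table size, which may not hold in practice (chemical terms columns, network latency and database load can all change the rate). The function and variable names are illustrative and not part of any ChemAxon tool.

```python
# Rough extrapolation from the numbers reported in the question.
# Assumption: throughput is constant and scales linearly with row count.

def estimate_hours(rows_done: int, hours_elapsed: float, rows_total: int) -> tuple[float, float]:
    """Return (rows per hour, estimated hours for rows_total rows)."""
    rate = rows_done / hours_elapsed
    return rate, rows_total / rate

# 70% of a 450K compound table regenerated in about 18 h.
rows_done = 450_000 * 70 // 100

# Extrapolate to the 5 million compound database.
rate_per_hour, est_hours = estimate_hours(rows_done, 18.0, 5_000_000)

print(f"Observed rate: {rate_per_hour:.0f} structures/h")
print(f"Estimated time for 5 million: {est_hours:.0f} h (about {est_hours / 24:.1f} days)")
```

At the observed rate the larger database would take well over a week, so it is worth applying the suggestions above before starting.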


Note that we do try to avoid the need for regeneration where possible, but where major changes to the JChem version are involved it is often unavoidable.


Also note that JChem now has a "pre-regeneration" feature that allows most of the work to be performed "offline" and then only the final update needs to be performed online. This will not speed up the process, but will avoid the need for your database to be offline for a long period. This feature is only available in JChemManager, not Instant JChem. You should use JChemManager to do the regeneration and then open the database in IJC when complete. See here for details:
http://www.chemaxon.com/jchem/doc/admin/#precalc


Tim