I'm trying to run some queries on a table containing ~9 million compounds. The problem is that when I try to build a SMARTS query, the sketch window often pops up blank. Not always, but most of the time. This behaviour also appears with other dialogs: annoyingly often they pop up huge but blank. I suspect this is a bug, because I don't see how this is supposed to help the user.
Another problem with the same query is that the first SMARTS pattern is processed fine, but the second one hangs. The first one took about 10-15 minutes; on the second one I lost patience after two hours. I'm using the 126.96.36.199 version of IJC, but I encountered the same problems on 2.4.3, before updating.
I tried to overcome these problems by pointing IJC to my system Java instead of the bundled one, but judging by the output on startup it still uses its own. I think I'm missing something; how can this be done?
Does anybody know how to make IJC work?
I have allocated 2 GB of memory and the performance is OK. The first query takes 10-15 minutes. I'd assume the next one would not take longer, but it can run for hours and not produce anything useful. I also noticed the same thing when trying to load a list: after a couple of hours nothing had happened.
Queries on the other parameters I have calculated seem fine, and the times are reasonable.
As for stopping the current action by clicking the "Stop" button: I would, but the dialogs are blank and the whole IJC window is unresponsive. The cancel button in the bottom-right corner doesn't work either. I don't think this is the desired behaviour.
I'm using IJC on 64-bit Ubuntu Linux, but since IJC uses its own Java it should not be a problem, should it? My co-worker is also using IJC on a 64-bit system (Windows, though) and he's not having these problems.
I've tried something similar on a 64-bit Linux system (Fedora Core 8 in my case) and found that things work OK as long as you allocate enough memory to IJC.
I did this:
1. Imported 4 million structures into a local database.
2. Set max memory (the Xmx setting) to 800 MB (500 MB was not sufficient).
3. Ran searches.
The first structure search took about 30 seconds to complete as the structure cache was being loaded (this is what needs the large amount of memory).
Subsequent searches complete in about a second.
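As a quick way to confirm that an Xmx value is really being honoured, a small standalone Java program can print the heap limit of the JVM that runs it. This is only a sketch: `HeapCheck` is a hypothetical helper, not part of IJC, the 800 MB default simply mirrors the figure above, and the 10% slack is an assumption (some JVM configurations report slightly less than the Xmx value).

```java
// HeapCheck.java: prints the heap limit of the JVM that runs it.
// Hypothetical helper for illustration; not part of IJC.
class HeapCheck {

    // Maximum heap the JVM will attempt to use, in whole megabytes.
    static long maxHeapMegabytes() {
        return Runtime.getRuntime().maxMemory() / (1024L * 1024L);
    }

    // True if the reported limit is at least 90% of the requested size.
    // The 10% slack is an assumption, since the reported value can be
    // somewhat below the -Xmx setting.
    static boolean meetsTarget(long maxMb, long targetMb) {
        return maxMb >= targetMb * 9 / 10;
    }

    public static void main(String[] args) {
        // Target in MB; defaults to the 800 MB used in the steps above.
        long targetMb = args.length > 0 ? Long.parseLong(args[0]) : 800;
        long maxMb = maxHeapMegabytes();
        System.out.println("Max heap: " + maxMb + " MB (target " + targetMb + " MB)");
        System.out.println(meetsTarget(maxMb, targetMb) ? "OK" : "Too small, raise -Xmx");
    }
}
```

For example, `java -Xmx800m HeapCheck.java 800` (single-file launch needs Java 11 or later; on older versions compile with `javac` first). Note that this reports on the JVM you launch it with, not on a running IJC instance.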
If you also have search terms for non-structure fields, the times are much slower, as you might expect, because the fields are not indexed. However, adding indexes seems to make matters worse, not better. This is probably a tuning problem with the Derby database. We will investigate this.
I also compared the Java version that is installed with IJC (32-bit) with a 64-bit version of Java and did not see any significant differences.
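When comparing two Java installations like this, it helps to be sure which one is actually running. The sketch below prints the version, install directory, and 32/64-bit data model of the JVM that executes it. `JvmInfo` is a hypothetical helper, not part of IJC, and `sun.arch.data.model` is a HotSpot-specific property that may be absent on other JVMs, hence the "unknown" fallback.

```java
// JvmInfo.java: describes the JVM that runs it.
// Hypothetical helper for illustration; not part of IJC.
class JvmInfo {

    // "32" or "64" on HotSpot JVMs; "unknown" if the property is not set.
    static String dataModel() {
        return System.getProperty("sun.arch.data.model", "unknown");
    }

    // One-line summary built from standard system properties.
    static String describe() {
        return "Java " + System.getProperty("java.version")
                + " (" + System.getProperty("java.vendor") + ")"
                + ", home=" + System.getProperty("java.home")
                + ", os.arch=" + System.getProperty("os.arch")
                + ", data model=" + dataModel();
    }

    public static void main(String[] args) {
        System.out.println(describe());
    }
}
```

Running it once with each `java` binary gives two lines that can be compared directly; the `home=` part shows which installation was used.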
I hope this information helps you sort out the problems.
Here are the results of some further research.
The performance of searches on non-structure fields can be sped up by adding appropriate indexes to the database. The following is based on the assumption that you are using a local database (please say if not), but similar things probably apply to Oracle and MySQL.
This is quite a complex subject, and adding an index can make things much worse in some cases, so this should be tested carefully.
But by judicious use of indexes I managed to execute combined structure and data queries on a database of 4 million structures in just a couple of seconds on a 64-bit Linux system. Of course, each query and database will be different, and some might be much worse.
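Since an index can help or hurt, one way to test it carefully is to time the same query before and after creating the index. The following is a minimal JDBC sketch under stated assumptions: `IndexTrial` and all table, column, and index names are hypothetical, the caller supplies an open `Connection` (ideally to a copy of the database, not one that IJC currently has open), and this is not the procedure from the IJC documentation.

```java
import java.sql.Connection;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;

// Hypothetical helper for illustration; not part of IJC.
class IndexTrial {

    // Builds a plain CREATE INDEX statement for one column.
    // The IDX_<table>_<column> naming scheme is an assumption.
    static String createIndexSql(String table, String column) {
        requireIdentifier(table);
        requireIdentifier(column);
        return "CREATE INDEX IDX_" + table + "_" + column
                + " ON " + table + " (" + column + ")";
    }

    // Rejects anything that is not a simple SQL identifier.
    static void requireIdentifier(String name) {
        if (!name.matches("[A-Za-z_][A-Za-z0-9_]*")) {
            throw new IllegalArgumentException("Not a simple SQL identifier: " + name);
        }
    }

    // Runs the query, reads every row, and returns the elapsed milliseconds.
    static long timeQueryMillis(Connection conn, String sql) throws SQLException {
        long start = System.nanoTime();
        try (Statement st = conn.createStatement();
             ResultSet rs = st.executeQuery(sql)) {
            while (rs.next()) {
                // Drain the result set so the full query cost is measured.
            }
        }
        return (System.nanoTime() - start) / 1_000_000;
    }

    // Times the query, creates the index, then times the query again.
    // Returns {millisBefore, millisAfter}.
    static long[] compare(Connection conn, String table, String column, String query)
            throws SQLException {
        long before = timeQueryMillis(conn, query);
        try (Statement st = conn.createStatement()) {
            st.executeUpdate(createIndexSql(table, column));
        }
        long after = timeQueryMillis(conn, query);
        return new long[] { before, after };
    }
}
```

A single before/after pair can mislead, because the second run may simply benefit from warm caches; repeating each measurement a few times, and trying the queries you actually use, gives a fairer picture.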
Rather than describe this in detail here I added extra information to the on-line documentation. Look here for starters:
Let me know if this helps (but make sure you solve the memory issue first: unless you have sufficient memory allocated, you will never see good performance).