How much memory to allocate to Tomcat

User f698d0529d

29-09-2005 14:15:04

Hi


This is a question I started to ask before, but the thread turned into a discussion on fingerprints, which has been useful and which I am still following up.





Just to clarify: my question about the memory requirements concerned the parameter to put in the catalina.sh file. Please bear with me while I explain.





export JAVA_OPTS=${JAVA_OPTS}' -Xmx800m'





The line above is my catalina.sh setting at the moment.





I worked out that all my JChem indexes together require 656M of cache memory (using the exact method of calculation based on fingerprint size and average SMILES length). On top of this, 64M is set in the JChemstreams web application as "reserved for temporary data".





So, in total, 720M should be needed. I had to increase it to 800M because otherwise I was still seeing messages in the catalina.out file that tables were being dropped from the cache due to lack of memory. I don't understand why this was happening. So I thought the best thing to do was to leave the 64M alone and keep increasing the catalina setting until the messages disappeared.





But I wonder now if increasing the 64M value and the total catalina amount still further would improve the performance. We now have plenty of RAM available on the server, but I don't want to waste memory either. For example, I wonder what happens if the total catalina memory is greater than the cache requirement and the temporary computational requirement. In that case, would some memory end up being reserved, but not used?





Do you have any suggestions as to how to set the value to the "optimum" value?

ChemAxon aa7c50abf8

29-09-2005 15:07:59

Mark,
Quote:
For example, I wonder what happens if the total catalina memory is greater than the cache requirement and the temporary computational requirement. In that case, would some memory end up being reserved, but not used?

The Java runtime will essentially reserve only the amount of memory which the application actually needs. The -Xmx switch basically tells the Java runtime that it should not allocate memory above this limit (but throw an OutOfMemoryError to the application). The Java runtime will not pre-allocate the memory specified by this switch. (For memory pre-allocation, there is another switch [-Xms] which we do not use.) In this sense a catalina memory setting exceeding the actual memory needs will not result in memory over-allocation.

Quote:
Do you have any suggestions as to how to set the value to the "optimum" value?

The amount of memory specified as the temporary "computational" requirement will not be used by the JChem structure cache. Since the actual temporary memory need is difficult to estimate (and there is some risk of eventually underestimating it), it is a good idea to specify the catalina memory setting so that there is some comfortable margin (as much as 100-200MB) left for both structure cache and computation. As stated above, there is no danger of over-allocating system memory with a too generous setting.
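
As a rough illustration of the sizing advice above, the arithmetic can be sketched in shell. The function name `suggest_xmx_mb` is hypothetical, and the 200 MB margin is only the upper end of the range mentioned; the 656 MB and 64 MB figures are the ones quoted earlier in this thread.

```shell
#!/bin/sh
# Hypothetical helper (not part of Tomcat or JChem): adds up the cache
# requirement, the temporary-data reservation and a safety margin, all in MB.
suggest_xmx_mb() {
    cache_mb=$1
    temp_mb=$2
    margin_mb=$3
    echo $((cache_mb + temp_mb + margin_mb))
}

# 656 MB structure cache, 64 MB temporary data, 200 MB assumed margin.
xmx_mb=$(suggest_xmx_mb 656 64 200)
echo "-Xmx${xmx_mb}m"
```

Since -Xmx is only an upper limit, rounding the result up generously should cost nothing in practice.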





Peter

User f698d0529d

29-09-2005 16:07:51

Thank you for that. It was helpful. You previously mentioned making changes to the default 64MB setting in the JChemstreams web application. What benefits or effects would you expect to see from changing this value? Presumably it is this 64MB which is used for JChem computation on the Tomcat side of things.

ChemAxon aa7c50abf8

30-09-2005 10:24:39

Mark,





It is not closely related to the subject, but I think it is useful to mention (to avoid potential misunderstandings) that the limit specified by the -Xmx switch applies to the heap portion of the entire memory taken up by the Java runtime. Due to the way the Java runtime is architected, the memory need of JChem Streams (and that of the underlying JChem Base components) is almost entirely met by increasing the Java runtime's heap memory. The rest of the entire memory (i.e. apart from the heap) taken up by the Java runtime is about 30-40MB and is not likely to change significantly.





In fewer words: we're concerned with the heap memory (limited by -Xmx), but in addition to the heap there is a more or less constant overhead of 30-40 MB (probably less, but let us be conservative).





On the Tomcat side of the JChem Cartridge, the main consumers of the heap memory can be divided in two categories: (a) the structure cache and (b) the Java objects used for computing and temporary storage. The amount of the memory required by the structure cache can be fairly accurately computed. The rest (i.e. type b) is difficult to compute/predict.





The memory required for temporary storage is mainly used to store the result sets of the queries. The bulk of the "computing"-related Java objects is associated with the second phase of the substructure search. The important thing is that the amount of memory consumed by operations of type (b) depends on the number of concurrent searches and the characteristics of the individual searches themselves. (Even if we knew the exact number of concurrent searches at a given point in time, the target structures in the tables involved and the queries for each of the concurrent searches, it would be very difficult to predict/calculate the memory consumption of type (b)).





As the structure cache for a given table is being built up (heap memory consumer (a)), the maximum available memory is always taken into account. If caching the structures of a table would require more memory than the amount specified with -Xmx less the amount specified for the estimated temporary memory need (by default 64MB), the table being loaded into cache would not be cached. In contrast, the heap memory consumers of type (b) are not aware of the available memory -- the nature of the functions they carry out is such that they could not adjust their behavior to a limited amount of memory anyway (or it would be extremely complex for them to do so). The memory required by type (b) must simply be there.
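
The caching rule described above can be modelled roughly as follows. This is an illustrative sketch only, not JChem's actual code: the function name `can_cache_table` and the table sizes are hypothetical, and the model ignores whatever other objects occupy the heap at the time.

```shell
#!/bin/sh
# Illustrative model only (not JChem's implementation).
# Succeeds (exit status 0) if a table of table_mb still fits in the cache
# once the temporary-data reservation is subtracted from the -Xmx limit.
can_cache_table() {
    xmx_mb=$1        # heap limit given with -Xmx
    reserved_mb=$2   # estimated temporary memory need (64 by default)
    cached_mb=$3     # memory already used by cached tables
    table_mb=$4      # memory the new table would need
    budget_mb=$((xmx_mb - reserved_mb))
    [ $((cached_mb + table_mb)) -le "$budget_mb" ]
}

# With -Xmx800m and 64 MB reserved, the cache budget is 736 MB.
if can_cache_table 800 64 600 100; then echo "cached"; else echo "not cached"; fi
if can_cache_table 800 64 700 100; then echo "cached"; else echo "not cached"; fi
```

In this model the first table fits (700 MB against a 736 MB budget) and the second does not (800 MB against 736 MB), which is the situation in which a table would be left out of the cache.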





The temporary storage estimation for type (b) (specified on the JChem Cartridge administration web page) is taken into account during structure table caching. Its purpose is to ensure that structure caches do not take up so much memory as to make the execution of the search impossible (due to insufficient memory). It follows from this that as long as you specify a large enough value for -Xmx, the temporary storage estimation will not effectively limit the number of cached structure tables. Since you have plenty of memory (and -Xmx will not cause over-allocation), I suggest setting -Xmx to something like 2000m to start with.
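
For reference, a minimal sketch of how the suggested limit could look in catalina.sh, assuming the same JAVA_OPTS approach as the existing -Xmx800m line (the double-quoted form and the final echo are just one way to write and check it):

```shell
#!/bin/sh
# Append the suggested heap limit to any options that are already set.
# ${JAVA_OPTS:-} expands to an empty string if JAVA_OPTS is not yet defined.
export JAVA_OPTS="${JAVA_OPTS:-} -Xmx2000m"

# Show the resulting options so the setting can be checked at startup.
echo "JAVA_OPTS=${JAVA_OPTS}"
```

If an older -Xmx value is still present in JAVA_OPTS, it would be safer to remove it rather than rely on which of the two the Java runtime honours.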





Peter

User f698d0529d

30-09-2005 14:54:04

Peter


I have increased the -Xmx setting from 800M to 2000M and the temporary JChemstreams memory from 64M to 128M. It will be some time before I can tell whether this has made any difference, and even then it may be unclear.


Thanks for your help


Mark