OutOfMemoryError during structure search

User b60e1d3756

14-05-2010 13:29:23

Hello,


I am trying to search for identical structures within a table using the Java API.


For certain entries the search stalls and then throws this exception:


Exception in thread "main" java.lang.OutOfMemoryError: GC overhead limit exceeded
        at chemaxon.calculations.Tautomerization.calcDelocPath(Tautomerization.java:12098)
        at chemaxon.calculations.Tautomerization.createDelocIsland(Tautomerization.java:11514)
        at chemaxon.calculations.Tautomerization.createSimpleTautomers(Tautomerization.java:5525)
        at chemaxon.calculations.Tautomerization.calculateDACouples(Tautomerization.java:8397)
        at chemaxon.calculations.Tautomerization.createDACouples(Tautomerization.java:8365)
        at chemaxon.marvin.calculations.TautomerizationPlugin.run(TautomerizationPlugin.java:655)
        at chemaxon.enumeration.TautomerEnumerator.doRun(TautomerEnumerator.java:231)
        at chemaxon.enumeration.TautomerEnumerator.hasMoreElements0(TautomerEnumerator.java:209)
        at chemaxon.enumeration.MolEnumerator.hasMoreElements(MolEnumerator.java:254)
        at chemaxon.enumeration.CombinationEnumerator.addEnumerators(CombinationEnumerator.java:152)
        at chemaxon.enumeration.CombinationEnumerator.nextElement0(CombinationEnumerator.java:203)
        at chemaxon.enumeration.MolEnumerator.nextElement(MolEnumerator.java:278)
        at chemaxon.jchem.db.JChemSearch.enumeratedSearch(JChemSearch.java:5224)
        at chemaxon.jchem.db.JChemSearch.search1(JChemSearch.java:2398)
...


Why does the search slow down, and how can I prevent it?


In case it helps to know what I am doing, I described it here:


https://www.chemaxon.com/forum/ftopic6026.html


Best regards,


Albina

User c1ce6b3d19

17-05-2010 08:57:25



Albina,


This is from our FAQ on out-of-memory errors. If you have further specific questions, please post them.


Java applications in general:
On most Java Virtual Machines the default maximum heap size is 64 MB. You can raise the maximum heap size of an application running under Sun's Java environment with the -Xmx parameter. For example, to allow an application 128 MB:
java -Xmx128m my.Application
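
To confirm that a new -Xmx value has actually taken effect, one option is to print the limit from inside the JVM. The following is a minimal sketch using only the standard Runtime API; the class name HeapCheck is made up for illustration and is not part of JChem.

```java
public class HeapCheck {

    /** Returns the maximum heap the JVM will attempt to use, in megabytes. */
    static long maxHeapMegabytes() {
        return Runtime.getRuntime().maxMemory() / (1024L * 1024L);
    }

    public static void main(String[] args) {
        System.out.println("Max heap: " + maxHeapMegabytes() + " MB");
    }
}
```

Run it with the same options as the real application, for example java -Xmx128m HeapCheck. The reported value is typically close to the requested size, though not necessarily identical.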


JChem applications:
The JChem application startup files (Windows batch files and Unix shell scripts) specify an application-specific value, which can easily be edited in the startup file. The FAQ page links to further information.


Web applications:
If your problem occurs in Tomcat, please see the Tomcat configuration page.
If you use a different servlet server, please consult the documentation of that software for details.


Please also see the "Memory" section of the hardware requirements for more information.


Jon

ChemAxon 9c0afc9aaf

17-05-2010 09:37:12

Hi Albina,


It looks like you are performing a tautomer search when looking for duplicate structures in the database, using method #2, the "Tautomer search option":


http://www.chemaxon.com/jchem/doc/dev/dbconcepts/index.html#tautomers


Using method #1, the "Tautomer duplicate table or JChem index option", is more efficient in this case, since it avoids enumerating a possibly large number of tautomers of the query as method #2 does, which means a lot of searches on the table for a single query.


The option in method #1 can also be changed for an existing table, but this requires recalculating the computed columns (regeneration):


http://www.chemaxon.com/jchem/doc/admin/index.html#table_settings


Szilard

User b60e1d3756

17-05-2010 11:34:55

Hello Szilard and Jon,


The problem was indeed in the tautomers. Thank you for the help!


I changed the table property (tautomerDuplicateChecking) and the DUPLICATE search started to work.


But I need the options from the FULL search: ignoring or considering charges, isotopes, and radicals. The users are supposed to define these properties, so it should be possible to vary these parameters in my program.


What would you recommend in this case?


1. Use the FULL search. Is it possible to speed up the tautomer search when using the FULL search?


2. Change these properties in the molecules according to the predefined parameters before the search (for example, discard the information on charges or isotopes if the user set IGNORE). This should be possible with Standardizer, I suppose.


With the kindest wishes,


Albina

ChemAxon a3d59b832c

17-05-2010 13:39:55

Hi Albina,


It is also possible to ignore charges, radicals, isotopes, etc. as search options with duplicate search.


However, I am not sure that it makes sense to combine all of these with the tautomer search option.


(Tautomers have the same molecular formula; the H atoms can move about in tautomeric regions within the molecule. Any change in charge or radical will definitely change the molecular formula as well, and it also influences the tautomerism of the functional groups in question...)


Standardization is also a possible solution for these, but then you will not be able to change the settings from search to search.


Full structure search with the tautomer search option enumerates the tautomers of the query, so it may be slow. To speed it up, you could use Standardizer with the "tautomerize" action (it calculates the canonical tautomer). In that case you will not need the tautomer search option. However, the "tautomerize" action is not advised in a standardization attached to a JChem table, because it may break substructure search. (The canonicalization algorithm is based on full molecules, not on substructures.)
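
The cost difference between enumerating the query's tautomers and comparing canonical tautomers can be illustrated with a toy model. The sketch below does not use the JChem API and does not describe how JChem works internally: the strings are made-up labels standing in for structures, the tautomer groups are hard-coded, and "canonical" is simply assumed to mean the lexicographically smallest label in a group. All class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class TautomerLookupSketch {

    // Toy data: each group lists labels standing in for the tautomers of one compound.
    // These are illustrative strings, not the output of a real tautomer generator.
    static final List<List<String>> GROUPS = Arrays.asList(
            Arrays.asList("keto-A", "enol-A1", "enol-A2"),
            Arrays.asList("amide-B", "imidic-B"));

    /** All toy tautomers of a form (just the form itself if it is in no group). */
    static List<String> tautomersOf(String form) {
        for (List<String> group : GROUPS) {
            if (group.contains(form)) {
                return group;
            }
        }
        return Collections.singletonList(form);
    }

    /** Canonical representative: here, the lexicographically smallest form. */
    static String canonical(String form) {
        return Collections.min(tautomersOf(form));
    }

    /** Enumerated search: one pass over the table per tautomer of the query. */
    static List<Integer> enumeratedSearch(List<String> table, String query) {
        List<Integer> hits = new ArrayList<>();
        for (String tautomer : tautomersOf(query)) {
            for (int row = 0; row < table.size(); row++) {
                if (table.get(row).equals(tautomer)) {
                    hits.add(row);
                }
            }
        }
        Collections.sort(hits);
        return hits;
    }

    /** Built once at import time: canonical form mapped to matching row numbers. */
    static Map<String, List<Integer>> buildCanonicalIndex(List<String> table) {
        Map<String, List<Integer>> index = new HashMap<>();
        for (int row = 0; row < table.size(); row++) {
            index.computeIfAbsent(canonical(table.get(row)), k -> new ArrayList<>()).add(row);
        }
        return index;
    }

    /** Canonical search: canonicalize the query once, then do a single lookup. */
    static List<Integer> canonicalSearch(Map<String, List<Integer>> index, String query) {
        return index.getOrDefault(canonical(query), Collections.emptyList());
    }

    public static void main(String[] args) {
        List<String> table = Arrays.asList("enol-A1", "amide-B", "keto-A", "other-C");
        Map<String, List<Integer>> index = buildCanonicalIndex(table);
        System.out.println(enumeratedSearch(table, "enol-A2")); // prints [0, 2]
        System.out.println(canonicalSearch(index, "enol-A2"));  // prints [0, 2]
    }
}
```

Both approaches find the same rows, but the enumerated version repeats the table search once per tautomer of the query, while the canonical version pays for canonicalization once per stored structure and once per query. The toy also shows the trade-off mentioned above: the stored keys are fixed at import time, so they cannot be varied from search to search.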


 


I hope this helps.


Best regards,


Szabolcs

User b60e1d3756

18-05-2010 13:01:08

Hello Szabolcs,


Thank you very much for the help. I am exploring the capabilities of the DUPLICATE search. Does it allow ignoring stereochemistry as well?


When I set searchOptions.setStereoSearchType(JChemSearchOptions.STEREO_IGNORE), I found that the compounds in the attachment are not considered duplicates during the search. I also set searchOptions.setImplicitHMatching(SearchConstants.IMPLICIT_H_MATCHING_ENABLED).


Is that the expected behavior? If I want those compounds to be retrieved as duplicates, what should I change?


With the kindest wishes,


Albina

ChemAxon a3d59b832c

18-05-2010 21:06:49

Hi,


Yes, these settings are valid, and do what you want to achieve.


However, we cannot really make out the molecules in the pictures. Can you attach them in higher resolution or as molecule sources?


Thanks,


Szabolcs

User b60e1d3756

19-05-2010 08:05:07

Hello,


Here are the molecules.

User b60e1d3756

19-05-2010 13:54:31

Hello,


I think this file will be better.


And I want to show the code as well. Maybe something is wrong...


    // ch, mol and tblName are set up earlier in the program.
    searcher.setConnectionHandler(ch);
    searcher.setQueryStructure(mol);

    // Duplicate search with stereo ignored.
    searchOptions.setSearchType(SearchConstants.DUPLICATE);
    stereo = JChemSearchOptions.STEREO_IGNORE;
    searchOptions.setStereoSearchType(stereo);

    // charge, isotop, radical, tautomer and vaguebond hold the user-selected settings.
    searchOptions.setChargeMatching(charge);
    searchOptions.setIsotopeMatching(isotop);
    searchOptions.setRadicalMatching(radical);
    searchOptions.setTautomerSearch(tautomer); // TODO check
    searchOptions.setVagueBondLevel(vaguebond);

    searchOptions.setImplicitHMatching(SearchConstants.IMPLICIT_H_MATCHING_ENABLED); // TODO or just remove

    searcher.setSearchOptions(searchOptions);
    searcher.setStructureTable(tblName);
    searcher.setRunMode(JChemSearch.RUN_MODE_SYNCH_COMPLETE);

    searcher.run();

User b60e1d3756

19-05-2010 13:55:48

this file

ChemAxon a3d59b832c

20-05-2010 14:45:01

Hi,


Thanks for the structures and the code example.


Unfortunately, the stereo = off search option is not available with the "duplicate search uses tautomers" table option.


(Here we had to choose how stereochemistry and tautomerism interact, because of the artifacts put into the database table. For more details, see "Stereo notes" in the documentation of the tautomer duplicate table option, linked below.)


http://www.chemaxon.com/jchem/doc/dev/dbconcepts/index.html#tautomers


We will think about how to allow stereo = off searches in this case.


In the meantime, you can use another method: for example, as you mentioned, you can use Standardizer to remove the stereo information from your structures.


Best regards,


Szabolcs

User b60e1d3756

21-05-2010 12:39:08

Thank you for the help! I will think about it.


With best regards,


Albina