jcman Markush structure upload problem

User 173254b396

09-11-2010 12:58:01

Hi,


We have been trying to create a Markush library in MySQL 5.1.5. We create the Markush structure with molconvert from a scaffold file (mrv) and one or more fragment files (mrv). Importing the resulting Markush structure into the database with jcman 5.3.8 works when the fragment file contains only a few structures. However, our fragment files can contain on the order of 10^5 structures. When we try to import a Markush structure built from 2,000-3,000 fragment structures, the import crashes with java.lang.OutOfMemoryError, no matter how much memory is allocated to Java (1.5.0_06).


Any help is appreciated.


Regards, Péter


Error log:


/home/kemipv/software/jchem538/bin/jcman0 a all_markush CVGI0779c_markush.mrv --connect "name=NAME" --driver com.mysql.jdbc.Driver --dburl jdbc:mysql://semldx00021:3306/vl

Collecting file information ...

Done.

Importing structures from CVGI0779c_markush.mrv ...

CVGI0779c_markush.mrv CVGI0779c_markush.mrv.allError in molecule 1

java.util.concurrent.ExecutionException: chemaxon.util.concurrent.processors.WorkUnitException: java.lang.OutOfMemoryError: Java heap space

at chemaxon.util.concurrent.processors.WorkUnitData.getResult(Unknown Source)

at chemaxon.util.concurrent.processors.ScheduledWorkUnitData.getResult(Unknown Source)

at chemaxon.util.concurrent.processors.WorkUnitDataIterator.getNext(Unknown Source)

at chemaxon.jchem.db.ParallelStructTableUpdater.importFile(ParallelStructTableUpdater.java:369)

at chemaxon.jchem.db.FileToSQLHandler.importFile(FileToSQLHandler.java:129)

at chemaxon.jchem.db.Importer.importMols(Importer.java:469)

at chemaxon.jchem.Command.importFromFile(Command.java:1191)

at chemaxon.jchem.Command.run(Command.java:655)

at chemaxon.jchem.Command.main(Command.java:216)

Caused by: chemaxon.util.concurrent.processors.WorkUnitException: java.lang.OutOfMemoryError: Java heap space

at chemaxon.util.concurrent.processors.InputOrderedWorkUnitProcessor.process(Unknown Source)

at chemaxon.util.concurrent.processors.InputOrderedWorkUnitProcessor.processInput(Unknown Source)

at chemaxon.util.concurrent.processors.WorkUnitWorker.work0(Unknown Source)

at chemaxon.util.concurrent.processors.WorkUnitWorker.work(Unknown Source)

at chemaxon.util.concurrent.worker.Worker$1.call(Unknown Source)

at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:269)

at java.util.concurrent.FutureTask.run(FutureTask.java:123)

at chemaxon.util.concurrent.worker.Worker.run(Unknown Source)

at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:417)

at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:269)

at java.util.concurrent.FutureTask.run(FutureTask.java:123)

at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:650)

at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:675)

at java.lang.Thread.run(Thread.java:595)

Caused by: java.lang.OutOfMemoryError: Java heap space





 

ChemAxon fb166edcbd

09-11-2010 19:29:27

So if I got it right:


Step 1. You run molconvert to create a Markush structure from a scaffold file and an R-definition file:


molconvert mrv scaffold.mrv -R1 r1def.mrv -o markush.mrv 

Step 2. Then you run jcman a to add it to the database:


jcman a table markush.mrv

 


You then get an out-of-memory error in Step 2, even if you increase the Java memory with the -Xmx option.


What is the size of the generated markush.mrv?


Can you load it into msketch with increased java memory?

User 173254b396

10-11-2010 08:17:27

Hi Nóra,


- You summarized the steps right.


-

User 173254b396

10-11-2010 09:35:51

Hi Nóra,


- Sorry for the previous unfinished message.


- You summarized the steps correctly.


- The smallest Markush file I have tried that fails to import is 2,070,781 bytes. If I split the fragment file into two pieces and create two Markush structures from them, both are imported without problems.


- msketch with HEAP_LIMIT=500 can display this 2 MB file, but gives the same java.lang.OutOfMemoryError: Java heap space error with a 4,098,140-byte file. If I change to HEAP_LIMIT=2000, the same 4 MB file stops with the error: Error in module Clean2D. The stack trace shows: java.lang.OutOfMemoryError: Java heap.
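
The splitting workaround described above (dividing the fragment file and building one Markush structure per piece) could be planned with a small script along these lines. This is only a rough sketch in Python: the batch size is a guess that would have to be tuned, the file names r1def_partN.mrv and markush_partN.mrv are hypothetical, and the command shape is simply copied from the molconvert line quoted earlier in this thread. Writing the per-batch fragment files is left out, since it depends on how the fragment mrv file is laid out.

```python
from math import ceil
from typing import List, Sequence


def split_into_batches(fragments: Sequence[str], max_per_batch: int) -> List[List[str]]:
    """Split fragment records into consecutive batches of at most max_per_batch."""
    if max_per_batch < 1:
        raise ValueError("max_per_batch must be at least 1")
    return [list(fragments[i:i + max_per_batch])
            for i in range(0, len(fragments), max_per_batch)]


def molconvert_command(scaffold: str, rdef_file: str, output: str) -> List[str]:
    """Build the argument list for one molconvert call.

    Same shape as the command quoted in this thread:
        molconvert mrv scaffold.mrv -R1 r1def.mrv -o markush.mrv
    """
    return ["molconvert", "mrv", scaffold, "-R1", rdef_file, "-o", output]


def plan_batches(n_fragments: int, max_per_batch: int,
                 scaffold: str = "scaffold.mrv") -> List[List[str]]:
    """Return one molconvert command per batch (nothing is executed here).

    The part file names are hypothetical placeholders.
    """
    n_batches = ceil(n_fragments / max_per_batch)
    return [
        molconvert_command(scaffold, f"r1def_part{k}.mrv", f"markush_part{k}.mrv")
        for k in range(1, n_batches + 1)
    ]


if __name__ == "__main__":
    # Assumed numbers: 10^5 fragments, 1000 fragments per Markush structure.
    for cmd in plan_batches(100_000, 1_000)[:3]:
        print(" ".join(cmd))
```

Each resulting markush_partN.mrv would then be imported with its own jcman a call, as was done by hand for the two halves.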


Let me know if you need more information.


Regards, Péter

ChemAxon fb166edcbd

10-11-2010 13:24:56

Now it seems that this problem is related to the handling of large structures in Marvin in general rather than a problem in the specific algorithm used by molconvert when fusing fragments.


I will contact my colleague who is responsible for the Marvin core.

ChemAxon fb166edcbd

11-11-2010 15:35:03

I could reproduce the problem with jcman, and it seems that our internal representation is not prepared to handle such large structures. We can improve this by processing fragments separately in our code, but implementing that will take some time.