Split a database of molecules into N smaller sets

User 2b68687bb8

30-12-2009 10:37:44


Hi,


I have a large set of molecules in mol2 format.
I would like to split this database into N smaller databases, for an
arbitrary N. What is the best approach for this with ChemAxon?


If mol2 is not suitable, I can also convert the molecules to another, more convenient format.


Thanks


ChemAxon d76e6e95eb

30-12-2009 11:53:46

If you can write some lines of code in Java, you have full control over which molecule you copy to which file. Please see the MolImporter and MolExporter classes of the ChemAxon toolkit.
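
As a rough illustration of the "some lines of Java" route, here is a minimal sketch that uses only the Java standard library, not the ChemAxon classes. It assumes that every mol2 record starts with an `@<TRIPOS>MOLECULE` line and that the whole file fits in memory. The class name `Mol2Splitter`, its method names, and the output names `part_1.mol2`, `part_2.mol2`, ... are hypothetical. With the toolkit, MolImporter and MolExporter would take over the reading and writing.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of the ChemAxon toolkit.
final class Mol2Splitter {
    // Assumption: each molecule record begins with this line.
    static final String RECORD_START = "@<TRIPOS>MOLECULE";

    /** Groups the lines of a mol2 file into one string per molecule record. */
    static List<String> splitRecords(List<String> lines) {
        List<String> records = new ArrayList<>();
        StringBuilder current = null;
        for (String line : lines) {
            if (line.startsWith(RECORD_START)) {
                if (current != null) {
                    records.add(current.toString());
                }
                current = new StringBuilder();
            }
            // Lines before the first record header (e.g. comments) are skipped.
            if (current != null) {
                current.append(line).append('\n');
            }
        }
        if (current != null) {
            records.add(current.toString());
        }
        return records;
    }

    /** Distributes the records round-robin over n buckets. */
    static List<List<String>> distribute(List<String> records, int n) {
        if (n < 1) {
            throw new IllegalArgumentException("n must be at least 1");
        }
        List<List<String>> buckets = new ArrayList<>();
        for (int i = 0; i < n; i++) {
            buckets.add(new ArrayList<>());
        }
        for (int i = 0; i < records.size(); i++) {
            buckets.get(i % n).add(records.get(i));
        }
        return buckets;
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 2) {
            System.err.println("usage: java Mol2Splitter <input.mol2> <N>");
            return;
        }
        int n = Integer.parseInt(args[1]);
        List<String> lines = Files.readAllLines(Paths.get(args[0]), StandardCharsets.UTF_8);
        List<List<String>> buckets = distribute(splitRecords(lines), n);
        for (int i = 0; i < n; i++) {
            // Hypothetical output naming: part_1.mol2, part_2.mol2, ...
            String text = String.join("", buckets.get(i));
            Files.write(Paths.get("part_" + (i + 1) + ".mol2"),
                        text.getBytes(StandardCharsets.UTF_8));
        }
    }
}
```

Round-robin assignment keeps the N sets within one molecule of each other in size. If contiguous blocks are preferred, only `distribute` needs to change.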


It is also possible to create smaller files using the mview command line tool. You can specify the start index and the number of molecules to open as command line parameters, for example:



mview in.sdf -s 501 -n 100


It will display 100 molecules starting at the 501st one of your SDfile. Then you can save the selected set with the File/Save menu.
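
To cover the whole file with N such calls, the start index and count for each piece have to be worked out first. The following is a minimal sketch of that arithmetic, assuming the total number of molecules is already known and that the start index is 1-based, as in the example above. The class `ChunkRanges` and the total of 1050 molecules are hypothetical; only the `-s` and `-n` parameters come from the example.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper that computes {start, count} pairs for N pieces.
final class ChunkRanges {
    /** Splits total molecules into at most n contiguous ranges with 1-based starts. */
    static List<int[]> ranges(int total, int n) {
        if (total < 0 || n < 1) {
            throw new IllegalArgumentException("total must be >= 0 and n >= 1");
        }
        List<int[]> result = new ArrayList<>();
        int base = total / n;   // minimum size of each piece
        int extra = total % n;  // the first 'extra' pieces get one more molecule
        int start = 1;
        for (int i = 0; i < n; i++) {
            int count = base + (i < extra ? 1 : 0);
            if (count > 0) {
                result.add(new int[] {start, count});
            }
            start += count;
        }
        return result;
    }

    public static void main(String[] args) {
        // Assumed example: 1050 molecules split into 4 pieces.
        for (int[] r : ranges(1050, 4)) {
            System.out.println("mview in.sdf -s " + r[0] + " -n " + r[1]);
        }
    }
}
```

Each printed command opens one piece, which then still has to be saved through the File/Save menu.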


If you would like to create the smaller pieces by structural criteria or by additional fields, you can examine how to use the jcsearch command line application or Instant JChem.

User 2b68687bb8

30-12-2009 14:40:05

Hi,


Thanks for the answer.


How can one afterwards save all the different individual set files?

ChemAxon d76e6e95eb

31-12-2009 11:03:42

I think I have described how to save the selected set of molecules in
mview, and the links I included for the other tools lead to examples that you might examine. If they are not clear, or if I misunderstood something, please explain the problem with a specific example.