How to handle properties in an SDF file while importing.

User ab70427b17

27-10-2013 09:35:23

I am trying to import some structures from an SDF file. Each molecule in the file carries a lot of extra data, such as purity, price, etc. I want to save this data to another table and bind it to the molecule by cd_id. I also want to filter out duplicate molecules, but the data attached to those duplicates should still be kept and saved in my table. However, chemaxon.jchem.db.Importer is a black box during import, and I cannot customize it. Is there a recommended pattern for handling this?


Thank you very much.

ChemAxon abe887c64e

28-10-2013 10:36:15

Unfortunately, we don't fully provide the requested functionality.


Extra data of structures (e.g., purity, price, ...) can be imported into JChem tables if you modify the tables by adding new columns before the import and connect the fields in the SD file to those columns during the import process.


If the JChem table has the 'filter out duplicate structures' setting enabled, then neither the structures nor the extra data of the duplicates will be registered.


Please take a look at the command line options of the import; they may help to identify the duplicates (even if the table doesn't have the 'filter out duplicate structures' setting).


Best regards,


Krisztina

ChemAxon 9c0afc9aaf

28-10-2013 11:12:02

Hi,


 


I think Importer is too high-level an API for your specific purpose.


I suggest reading the Molecule objects from the file yourself, using MolImporter for this.


http://www.chemaxon.com/jchem/doc/dev/java/api/chemaxon/formats/MolImporter.html


You will find the attached data in the properties of the Molecule:


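import chemaxon.formats.MolImporter;
import chemaxon.marvin.io.MPropHandler;
import chemaxon.struc.Molecule;

MolImporter molImporter = new MolImporter(...);
Molecule mol = molImporter.read();
int propCount = mol.getPropertyCount();
for (int x = 0; x < propCount; x++) {
    // Key and string value of each data field attached to the molecule
    String propKey = mol.getPropertyKey(x);
    String propValue = MPropHandler.convertToString(mol.properties(), propKey);
}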
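For orientation, the properties discussed here come from the "> <NAME>" data items that follow the structure block in each SD record. The following is only a minimal plain-Java sketch of that layout, with no JChem classes involved; SdfFields and its parse method are hypothetical names made up for this illustration, and the parsing is simplified (it assumes data items are separated by blank lines and does not handle every corner of the SD format). In a real import, MolImporter does this work for you.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper (not part of JChem): collects the "> <NAME>" data
// items of one SD record into an ordered name -> value map.
final class SdfFields {
    static Map<String, String> parse(String record) {
        Map<String, String> fields = new LinkedHashMap<>();
        String[] lines = record.split("\r?\n");
        for (int i = 0; i < lines.length; i++) {
            String line = lines[i];
            int open = line.indexOf('<');
            int close = line.indexOf('>', open + 1);
            if (!line.startsWith(">") || open < 0 || close < 0) {
                continue; // not a data header line
            }
            String name = line.substring(open + 1, close);
            StringBuilder value = new StringBuilder();
            // The value runs until the next blank line or the record terminator.
            while (i + 1 < lines.length
                    && !lines[i + 1].trim().isEmpty()
                    && !lines[i + 1].startsWith("$$$$")) {
                if (value.length() > 0) {
                    value.append('\n');
                }
                value.append(lines[++i]);
            }
            fields.put(name, value.toString());
        }
        return fields;
    }
}
```

Each entry of the returned map corresponds to one key/value pair that you would otherwise obtain through getPropertyKey and MPropHandler.convertToString.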

 


Then you can either do the search yourself:


http://www.chemaxon.com/jchem/doc/dev/search/index.html#duplicate


Or attempt the insert directly with UpdateHandler with duplicate filtering enabled, and detect when the insert is rejected because the structure is a duplicate:


http://www.chemaxon.com/jchem/doc/dev/modify/index.html#modify_rows


http://www.chemaxon.com/jchem/doc/dev/java/api/chemaxon/jchem/db/UpdateHandler.html#setDuplicateFiltering(int)


Please note that the default for duplicate filtering is set at table creation.


UpdateHandler will also need the structure source as a String - this is stored in the cd_structure column.


You can either convert to the desired storage format (our "mrv" format supports all features):


http://www.chemaxon.com/jchem/doc/dev/java/api/chemaxon/formats/MolExporter.html#exportToFormat(chemaxon.struc.Molecule, java.lang.String)


Or you may first grab the molecule records as Strings (sections of the original source) and use these Strings for UpdateHandler later:


http://www.chemaxon.com/jchem/doc/dev/java/api/chemaxon/formats/MolImporter.html#readRecordAsText()
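To make the bookkeeping concrete, here is a rough in-memory sketch of the one-to-many binding: every record's properties are stored against a cd_id, and a duplicate structure reuses the existing cd_id instead of creating a new row. Everything in it is an assumption for illustration only: PropertyRegistry, PropertyRow and register are hypothetical names, the maps stand in for real database tables, and structureKey is a placeholder for whatever identifies a duplicate in your setup (in JChem that decision would come from a duplicate search or from an insert rejected by UpdateHandler, not from a string comparison).

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// In-memory stand-in for two tables: a structure table and a
// one-to-many property table keyed by cd_id. Hypothetical, not JChem API.
final class PropertyRegistry {
    /** One row of the hypothetical property table. */
    static final class PropertyRow {
        final int cdId;
        final String name;
        final String value;

        PropertyRow(int cdId, String name, String value) {
            this.cdId = cdId;
            this.name = name;
            this.value = value;
        }
    }

    private final Map<String, Integer> cdIdByStructure = new LinkedHashMap<>();
    private final List<PropertyRow> propertyRows = new ArrayList<>();
    private int nextCdId = 1;

    /** Registers one record and returns the cd_id its properties were bound to. */
    int register(String structureKey, Map<String, String> properties) {
        Integer cdId = cdIdByStructure.get(structureKey);
        if (cdId == null) {
            // New structure: this is where the real insert would happen.
            cdId = nextCdId++;
            cdIdByStructure.put(structureKey, cdId);
        }
        // Duplicate or not, the properties are always kept.
        for (Map.Entry<String, String> e : properties.entrySet()) {
            propertyRows.add(new PropertyRow(cdId, e.getKey(), e.getValue()));
        }
        return cdId;
    }

    int structureCount() {
        return cdIdByStructure.size();
    }

    List<PropertyRow> rowsFor(int cdId) {
        List<PropertyRow> rows = new ArrayList<>();
        for (PropertyRow row : propertyRows) {
            if (row.cdId == cdId) {
                rows.add(row);
            }
        }
        return rows;
    }
}
```

Registering the same structure twice with different purity values would leave one structure row and two property rows sharing its cd_id, which is the behaviour asked for in the original question.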


I hope this helps.


Szilard


 


 

User ab70427b17

29-10-2013 01:57:12

Dear kvajda, it is a one-to-many relationship, so I cannot add them to the structure table directly. Thank you.


Dear Szilard, that is what I need. Thank you very much. :)