SDF File Import

User 34fa07fa99

05-01-2007 01:04:46

I am trying to import an SDF file using the standard procedures like:





Code:

   Importer i = new Importer();
   ConnectionHandler ch = getConnectionHandler();
   ch.setPropertyTable(getTempPropertiesTableName());
   i.setConnectionHandler(ch);
   i.setInput(getSdfFile());
   i.setTableName(getTempTableName());
   // Apply the field mapping only when one was produced
   String fieldMapping = formatFieldMapping();
   if (fieldMapping != null) {
      log.debug("Mapping : " + fieldMapping);
      i.setConnections(fieldMapping);
   }
   i.importMols();








The SDF file contains a single molecule, but the importer somehow creates many empty molecules. I suspect something is wrong with the SDF file, which I got from another piece of software, but I am not sure what. Later on, my code also searches for duplicates, and that search throws an exception:
Quote:






java.io.IOException: Unknown amino acid 9
   at chemaxon.marvin.modules.PeptideReader.findAminoAcids(PeptideReader.java:278)
   at chemaxon.marvin.modules.PeptideReader.convert(PeptideReader.java:115)
   at chemaxon.marvin.modules.PeptideImport.readMol(PeptideImport.java:71)
   at chemaxon.formats.MolImporter.readMol(MolImporter.java:763)
   at chemaxon.formats.MolImporter.read(MolImporter.java:595)
   at chemaxon.formats.MolImporter.read(MolImporter.java:561)
   at chemaxon.util.MolHandler.importMol(MolHandler.java:683)
   at chemaxon.util.MolHandler.setMolecule(MolHandler.java:159)
   at chemaxon.jchem.db.JChemSearch.readQuery(JChemSearch.java:2960)
   at chemaxon.jchem.db.JChemSearch.search(JChemSearch.java:2216)
   at chemaxon.jchem.db.JChemSearch.setRunning(JChemSearch.java:2105)
   at chemaxon.jchem.db.JChemSearch.run(JChemSearch.java:2125)


Any ideas ?


Thx

ChemAxon 9c0afc9aaf

05-01-2007 09:04:37

Hi,





Which version of JChem are you using ?





Szilard

ChemAxon 9c0afc9aaf

05-01-2007 09:30:41

Hi,





Meanwhile I took a look at the source.


The file has errors in it.





1. The "M  END" line is missing from the end of the CTAB block.
2. There are some extra lines at the end of the file.
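
For anyone with many such files, the first kind of error can be detected before import. The following is only a minimal sketch and not part of JChem: the class name SdfCheck and the method recordsMissingMEnd are hypothetical, and it assumes records in which each molecule block ends with an "M  END" line and each record ends with a "$$$$" line.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for illustration; not a JChem or Marvin class.
class SdfCheck {

    /** Returns the 1-based numbers of SDF records that contain no "M  END" line. */
    static List<Integer> recordsMissingMEnd(List<String> lines) {
        List<Integer> missing = new ArrayList<>();
        int record = 1;
        boolean sawContent = false; // any non-blank line in the current record
        boolean sawMEnd = false;    // "M  END" seen in the current record
        for (String line : lines) {
            if (line.startsWith("$$$$")) {
                // Record delimiter: report the record if it had content but no "M  END"
                if (sawContent && !sawMEnd) {
                    missing.add(record);
                }
                record++;
                sawContent = false;
                sawMEnd = false;
            } else {
                if (line.startsWith("M  END")) {
                    sawMEnd = true;
                }
                if (!line.trim().isEmpty()) {
                    sawContent = true;
                }
            }
        }
        // A trailing record without a "$$$$" delimiter is checked as well
        if (sawContent && !sawMEnd) {
            missing.add(record);
        }
        return missing;
    }

    public static void main(String[] args) throws IOException {
        List<String> lines = Files.readAllLines(Paths.get(args[0]));
        System.out.println("Records missing M  END: " + recordsMissingMEnd(lines));
    }
}
```

Run it with the path of the SDF file as the only argument. It only reports the problem; it does not attempt to repair the file.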





You can find further information in this document:





http://www.mdl.com/downloads/public/ctfile/ctfile.pdf





I have also attached the fixed file.





Best regards,





Szilard

ChemAxon 9c0afc9aaf

05-01-2007 14:10:02

PS:





If you have a lot of such sources, MolConverter (molconvert) seems to tolerate these errors better, so you can correct them with an SDF -> SDF conversion.





http://www.chemaxon.com/marvin/doc/user/molconvert.html





Best regards,





Szilard

User 34fa07fa99

05-01-2007 14:50:03

Szilard wrote:
Which version of JChem are you using ?

I was using 3.2

User 34fa07fa99

05-01-2007 15:57:05

Thank you for the help; it works once the file is fixed. I will have to make my program call molconvert.bat, because it would be difficult for my client to do it himself.

Thanks, you are very helpful as always.

User 34fa07fa99

05-01-2007 19:00:05

I succeeded in calling molconvert.bat from my Java program with the following code, which I am posting here because it took me some time to get working:

Code:

// JChemPath is a servlet init parameter pointing at the JChem bin directory
File cmd = new File(getServletConfig().getInitParameter("JChemPath"),
                            "molconvert.bat");
// Run the batch file through the Windows command interpreter
Process p = Runtime.getRuntime().exec(new String[] {
                       "cmd", "/c", "start", cmd.getAbsolutePath(),
                       "sdf", sdf.getAbsolutePath(), "-o", fixedSdf.getAbsolutePath()
}, null, cmd.getParentFile());








Now the next problem that is somewhat unrelated is that my exact search doesn't find the imported molecules so I can import them as many times as I want. Even using the Marvin applet, the exact search does not find the molecule while the substructure search seems to work fine. What could be the reason for this?





Thank you in advance.

User 34fa07fa99

05-01-2007 19:10:15

Actually it does work for simple molecules but not for the one in the above sdf file.

User 34fa07fa99

05-01-2007 19:28:59

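Here is a corrected version of the code above, in case anyone needs it. It drops the Windows-only "start" and ".bat" suffix so the same call can also run on non-Windows systems:

Code:

File cmd = new File(getServletConfig().getInitParameter("JChemPath"),
                    "molconvert");
// A backslash separator indicates Windows, where the script must run through cmd
boolean dos = File.separatorChar == '\\';
// Note: this single-string form of exec splits the command on whitespace,
// so it breaks if any of the paths contains a space
Runtime.getRuntime().exec((dos ? "cmd /c " : "")
                  + cmd.getAbsolutePath() + " sdf "
                  + sdf.getAbsolutePath() + " -o "
                  + fixedSdf.getAbsolutePath(), null, cmd.getParentFile());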








Here cmd is the path to the molconvert script in the JChem bin directory, and sdf and fixedSdf are the SDF files before and after conversion.
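
One thing the snippets above do not do is wait for the conversion to finish, which matters if the fixed file is imported right afterwards. The following is a sketch of one way to do that with ProcessBuilder, under some assumptions: the class name MolConvertRunner and its methods are hypothetical, the molconvert arguments are the same ones used above, and a non-zero exit code is treated as failure (not confirmed for molconvert). Passing each argument as its own list element avoids the whitespace splitting of the single-string exec, although cmd.exe has its own quoting rules, so paths with spaces may still need care on Windows.

```java
import java.io.File;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for illustration; not part of JChem.
class MolConvertRunner {

    /** Builds the argument list for an SDF -> SDF conversion, one element per argument. */
    static List<String> buildCommand(boolean windows, File script, File in, File out) {
        List<String> command = new ArrayList<>();
        if (windows) {
            // Batch scripts have to be started through the command interpreter
            command.add("cmd");
            command.add("/c");
        }
        command.add(script.getAbsolutePath());
        command.add("sdf");
        command.add(in.getAbsolutePath());
        command.add("-o");
        command.add(out.getAbsolutePath());
        return command;
    }

    /** Runs the conversion and blocks until the external process has exited. */
    static void convert(File script, File in, File out)
            throws IOException, InterruptedException {
        boolean windows = File.separatorChar == '\\';
        ProcessBuilder pb = new ProcessBuilder(buildCommand(windows, script, in, out));
        pb.directory(script.getParentFile());
        // Forward the tool's output to this JVM's streams so the pipes cannot fill up
        pb.inheritIO();
        int exitCode = pb.start().waitFor();
        if (exitCode != 0) {
            // Assumption: molconvert signals failure with a non-zero exit code
            throw new IOException("molconvert exited with code " + exitCode);
        }
    }
}
```

With this helper, the import would be started only after convert(cmd, sdf, fixedSdf) returns without throwing.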

ChemAxon 9c0afc9aaf

06-01-2007 10:59:29

Quote:
Now the next problem that is somewhat unrelated is that my exact search doesn't find the imported molecules so I can import them as many times as I want. Even using the Marvin applet, the exact search does not find the molecule while the substructure search seems to work fine. What could be the reason for this?
I have tested with the attached "fixed.sdf", and it finds itself fine with both "EXACT" and "PERFECT" search.

Could you provide more details on how you run the search?





By the way we recommend the "PERFECT" search mode for filtering duplicates.





Best regards,





Szilard

User 34fa07fa99

09-01-2007 11:13:33

Here is my code:


Code:
   ConnectionHandler ch = new ConnectionHandler();
   ch.setConnection(getConnection());
   ch.setPropertyTable("jchemproperties");

   JChemSearch searcher = new JChemSearch();
   searcher.setConnectionHandler(ch);
   searcher.setSearchType(SearchConstants.PERFECT);
   searcher.setStructureTable("structures");
   //searcher.setInfoToStdError(true);
   searcher.setWaitingForResult(true);
   searcher.setQueryStructure(structure);
   searcher.run();

   logger.debug(searcher.getResultCount());





Thanks again

ChemAxon 9c0afc9aaf

09-01-2007 12:07:15

Hi,





The structure (fixed.sdf) finds itself fine for me again, even using your code.








You can make sure you are using the reported version (3.2) by printing the value of this constant:





Code:
chemaxon.jchem.VersionInfo.JCHEM_VERSION






Also make sure no other ChemAxon jar (e.g. MarvinBeans.jar) is present in the classpath, only a single jchem.jar.

Also make sure you use autocommit (the default), or explicitly commit after insertion if you use another connection for that.
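
The commit advice can be sketched in plain JDBC as follows. This is only an illustration: the class CommitHelper and the method commitIfNeeded are hypothetical names, and the idea is to call it on the connection used for the import before searching from a different connection.

```java
import java.sql.Connection;
import java.sql.SQLException;

// Hypothetical helper for illustration; not part of JChem.
class CommitHelper {

    /**
     * Commits pending work unless the connection is in autocommit mode.
     * Returns true if an explicit commit was issued.
     */
    static boolean commitIfNeeded(Connection con) throws SQLException {
        if (con.getAutoCommit()) {
            // Every statement was already committed automatically
            return false;
        }
        con.commit();
        return true;
    }
}
```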





Best regards,





Szilard

User 34fa07fa99

09-01-2007 12:59:51

The problem only happens in my development environment, so I guess I haven't updated the structure tables properly or something. It's probably my mistake, sorry.





Thanks again.