Error opening SDF file with MarvinView or MolImporter

User 8cbba88c0e

30-06-2011 09:14:55

Hi,


When I try to open the attached file with MarvinView, I get the following stack trace:


chemaxon.formats.MolFormatException: Error parsing SMILES string '2|LigandData|sdf|2|dock1' at character 1 ('2')
 at chemaxon.marvin.io.formats.smiles.SmilesImport.readMol0(Unknown Source)
 at chemaxon.marvin.io.formats.smiles.SmilesImport.readMol(Unknown Source)
 at chemaxon.marvin.io.formats.smiles.CxsmilesImport.readMol(Unknown Source)
 at chemaxon.marvin.io.formats.smiles.SmilesImport.readMol(Unknown Source)
 at chemaxon.marvin.io.MRecordImporter.readStructure(Unknown Source)
 at chemaxon.marvin.io.MRecordImporter.readDoc(Unknown Source)
 at chemaxon.marvin.io.MRecordImporter.readDoc(Unknown Source)
 at chemaxon.marvin.io.MRecordImporter.readDoc0(Unknown Source)
 at chemaxon.marvin.io.MRecordImporter.readDoc(Unknown Source)
 at chemaxon.formats.MolImporter.readDoc(Unknown Source)
 at chemaxon.formats.MolImporter.nextDoc(Unknown Source)
 at chemaxon.marvin.view.MDocStorage.readDoc(Unknown Source)
 at chemaxon.marvin.view.MDocStorage.tryToExtend(Unknown Source)
 at chemaxon.marvin.view.MDocStorage.getMainDoc(Unknown Source)
 at chemaxon.marvin.view.MDocStorage.getMainDoc(Unknown Source)
 at chemaxon.marvin.view.swing.TableSupport$11.run(Unknown Source)
 at chemaxon.marvin.view.swing.TableSupport.setDocSource(Unknown Source)
 at chemaxon.marvin.view.swing.TableSupport.access$600(Unknown Source)
 at chemaxon.marvin.view.swing.TableSupport$UpdateTask.run0(Unknown Source)
 at chemaxon.marvin.view.swing.TableSupport$UpdateTask.run(Unknown Source)
 at chemaxon.marvin.view.SequentialScheduler$CmpRunnable.run(Unknown Source)
 at chemaxon.marvin.view.SequentialScheduler$1.run(Unknown Source)


If I try to read the sdf file with


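// Open the SDF as a stream and let MolImporter guess the format
File results = new File(m_Output.getStringValue());
FileInputStream fileInputStream = new FileInputStream(results);
MolImporter molImporter = new MolImporter(fileInputStream);
Molecule molecule;
while ((molecule = molImporter.read()) != null) {
    // process each molecule here
}
molImporter.close();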


I get the following error:



Error parsing SMILES string '2|LigandData|sdf|2|dock1' at character 1 ('2')


The SDF file looks like a standard file, and if I delete the first line, MarvinView reads it fine and I can also read it with MolImporter. The file is generated by another application on a Linux system, and I am working on it from Windows. I ran unix2dos on it, but I get the same result even if I don't.


Can you suggest why I am getting these errors (and a fix if possible)?


Thanks


Best regards


Bob


ChemAxon 5433b8e56b

05-07-2011 14:37:47

Hi Bob,


This happens because of the pipe (|) characters on the first line: the SMILES recognizer claims the file, and Marvin tries to import it as SMILES. You can use the MolImporter(String, String) constructor and pass "sdf" as the second parameter; this explicitly tells Marvin to import the file as SDF.
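A minimal sketch of that workaround, assuming the Marvin classes are on the classpath. The class name ReadSdfExplicitly and the fallback path "results.sdf" are placeholders for illustration, not anything from the original post:

```java
import chemaxon.formats.MolImporter;
import chemaxon.struc.Molecule;

import java.io.IOException;

class ReadSdfExplicitly {
    public static void main(String[] args) throws IOException {
        // Hypothetical path; replace it with the location of your SDF file.
        String path = args.length > 0 ? args[0] : "results.sdf";

        // The second argument names the format explicitly, so the importer
        // does not have to guess it from the first line of the file.
        MolImporter importer = new MolImporter(path, "sdf");
        try {
            Molecule molecule;
            int count = 0;
            while ((molecule = importer.read()) != null) {
                count++; // process each molecule here
            }
            System.out.println("Read " + count + " structures");
        } finally {
            importer.close();
        }
    }
}
```

With the format given explicitly, a title line such as '2|LigandData|sdf|2|dock1' should be treated as the SDF header line rather than as a SMILES string.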


In Marvin 5.6 the recognition will be improved. This is a hard case, and the new solution is not perfect either, but it is much more effective. (File format recognition is hard to get perfect, and it is improved continuously in Marvin as we discover issues like this one.) The following rule will apply in 5.6:
If the first line contains two pipe characters, the first one is preceded by whitespace, and the second one is followed by whitespace or the end of the file, then the file is considered to be in cxsmiles format.


Current Marvin versions use the following rule: if the first line contains a pipe character and does not start with the "Vj" character sequence, then the file is considered to be in cxsmiles format. This rule is what causes the problem.
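To make the difference between the two rules concrete, here is a rough sketch of both as plain Java string checks. This is only one reading of the rules as worded above, not Marvin's actual recognizer code: the class and method names are made up, "two pipe characters" is read as "the first two pipes on the line", the end of the first line is treated like end of file, and the cxsmiles sample line is just an illustrative string.

```java
class FormatGuessSketch {

    // Rule in current versions, as described above: a pipe anywhere on the
    // first line means cxsmiles, unless the line starts with "Vj".
    static boolean oldRuleSaysCxsmiles(String firstLine) {
        return firstLine.indexOf('|') >= 0 && !firstLine.startsWith("Vj");
    }

    // Rule planned for 5.6, as described above (assumed reading): the first
    // pipe must follow whitespace, and the second pipe must be followed by
    // whitespace or by the end of the line.
    static boolean newRuleSaysCxsmiles(String firstLine) {
        int first = firstLine.indexOf('|');
        if (first < 0) {
            return false;
        }
        int second = firstLine.indexOf('|', first + 1);
        if (second < 0) {
            return false;
        }
        boolean spaceBeforeFirst =
                first > 0 && Character.isWhitespace(firstLine.charAt(first - 1));
        boolean endOrSpaceAfterSecond =
                second == firstLine.length() - 1
                || Character.isWhitespace(firstLine.charAt(second + 1));
        return spaceBeforeFirst && endOrSpaceAfterSecond;
    }

    public static void main(String[] args) {
        String sdfTitle = "2|LigandData|sdf|2|dock1"; // first line of Bob's file
        String cxsmiles = "c1ccccc1 |c:0,2,4|";       // illustrative cxsmiles-style line

        System.out.println(oldRuleSaysCxsmiles(sdfTitle)); // true
        System.out.println(newRuleSaysCxsmiles(sdfTitle)); // false
        System.out.println(oldRuleSaysCxsmiles(cxsmiles)); // true
        System.out.println(newRuleSaysCxsmiles(cxsmiles)); // true
    }
}
```

Under this reading, the title line of the SDF file is misclassified only by the old rule, because its first pipe follows the character '2' rather than whitespace.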


Regards,
Istvan

User 8cbba88c0e

05-07-2011 15:53:46

Thanks for your help and explanation Istvan.


I will try MolImporter(String, String) and look forward to the improvements in Marvin 5.6.


Cheers


Bob