Program dies on a bad SMILES

User 55ffa2f197

04-10-2011 17:09:54

The following SMILES caused my simple code to die with this error message:


Exception in thread "main" java.util.UnknownFormatConversionException: Conversion = '1'
    at java.util.Formatter.checkText(Formatter.java:2547)
    at java.util.Formatter.parse(Formatter.java:2533)
    at java.util.Formatter.format(Formatter.java:2469)
    at java.io.PrintStream.format(PrintStream.java:970)
    at java.io.PrintStream.printf(PrintStream.java:871)


 


Here is the SMILES:


O=C1C=CC=C2C1=NC1=c3ccc4c5cc(=O)c6c[nH]cc7ccc(c8ccc(C9=Nc%10ccccc%10[N]219)c3c48)c5c67


MarvinView renders it without complaint, though. How do I keep a bad egg like this from crashing the program? Ideally the bad SMILES would also be sent to stderr.


I am using the following block to get the molecule and do some cleaning up on my molecules:


.....


try {
    Molecule m = MolImporter.importMol(st.nextToken(), "smiles");
    try {
        plugin.setMolecule(m);
        plugin.run();
    } catch (PluginException e) {
        e.printStackTrace();
    }

    // get props I might need
    ringCount = plugin.getRingCount();
    rotatableBondCount = plugin.getRotatableBondCount();
    smallestRingSize = plugin.getSmallestRingSize();

    // further cleaning up of the molecules
    int atomcount = m.getAtomCount();
    for (int i = 0; i < atomcount; i++) {
        MolAtom atom = m.getAtom(i);
        if (atom.getAtno() == MolAtom.ANY) {
            atom.setAtno(1);
        }
    }
    m.implicitizeHydrogens( atomcount );
    tsmiles.add( m.toFormat( "smiles" ) );
} catch (MolFormatException e) {
    e.printStackTrace();
}


Thanks


Dong

ChemAxon 9c0afc9aaf

04-10-2011 20:28:39

The problem has been handled offline: printf interpreted some SMILES characters as format specifiers; println works fine.
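For readers who hit the same stack trace, here is a minimal sketch of the pitfall. It assumes the SMILES was passed to printf as the format string itself; the thread does not show that line, so this is a guess at what happened. The likely culprit is the '%' in the two-digit ring-closure label %10. The class and method names are made up for illustration.

```java
import java.util.IllegalFormatException;

final class PrintfSmilesDemo {
    // SMILES from the first post; "%10" is a two-digit ring-closure label.
    static final String SMILES =
        "O=C1C=CC=C2C1=NC1=c3ccc4c5cc(=O)c6c[nH]cc7ccc(c8ccc(C9=Nc%10ccccc%10[N]219)c3c48)c5c67";

    /** Returns true if using s itself as a format string makes the Formatter throw. */
    static boolean failsAsFormatString(String s) {
        try {
            String.format(s); // parsed the same way as System.out.printf(s)
            return false;
        } catch (IllegalFormatException e) {
            return true;
        }
    }

    /** Passes s as an argument, so any '%' inside it is never parsed as a specifier. */
    static String formatSafely(String s) {
        return String.format("%s", s);
    }

    public static void main(String[] args) {
        System.out.println("fails as format string: " + failsAsFormatString(SMILES));
        System.out.printf("%s%n", SMILES); // safe: the SMILES is an argument
        System.out.println(SMILES);        // safe: no format parsing at all
    }
}
```

Either println or printf("%s%n", smiles) avoids the problem, because the SMILES is then treated as data and not as a format string.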

User 55ffa2f197

05-10-2011 11:34:57










Szilard wrote:

The problem has been handled offline: printf interpreted some SMILES characters as format specifiers; println works fine.



Right, that was a false alarm, but here is a real crash in the same program, caused by this SMILES:


*C(CN(*)*)CN1c2ccccc2-[#5&a]c-c2ccccc12.


The error message is shown below. My question remains: how do I ignore such bad SMILES and move on to process the next one? I know that among the ~8 million SMILES I am processing, a lot are problematic. The following is the relevant code block:


try {
    Molecule m = MolImporter.importMol(st.nextToken(), "smiles");

    try {
        plugin.setMolecule(m);
        plugin.run();
    } catch (PluginException e) {
        e.printStackTrace();
    }

    // get props I might need
    ringCount = plugin.getRingCount();
    rotatableBondCount = plugin.getRotatableBondCount();
    smallestRingSize = plugin.getSmallestRingSize();

    // further cleaning up of the molecules
    int atomcount = m.getAtomCount();
    for (int i = 0; i < atomcount; i++) {
        MolAtom atom = m.getAtom(i);
        if (atom.getAtno() == MolAtom.ANY) {
            atom.setAtno(1);
        }
    }
    m.implicitizeHydrogens( atomcount );
    tsmiles.add( m.toFormat( "smiles" ) );
} catch (MolFormatException e) {
    e.printStackTrace();
}


 


Error message


Exception in thread "main" java.lang.IllegalArgumentException: chemaxon.marvin.io.MolExportException: The following atom cannot be aromatic according to the SMILES definition: B
    at chemaxon.struc.Molecule.toFormat(Molecule.java:1421)
    at CleanIBMPatentSmiles.readLines(CleanIBMPatentSmiles.java:72)
    at readIBM.main(readIBM.java:10)
Caused by: chemaxon.marvin.io.MolExportException: The following atom cannot be aromatic according to the SMILES definition: B
    at chemaxon.marvin.io.formats.smiles.SmilesExport.generateSmilesString(SmilesExport.java:2435)
    at chemaxon.marvin.io.formats.smiles.SmilesExport.singleMolToSMILES(SmilesExport.java:1061)
    at chemaxon.marvin.io.formats.smiles.SmilesExport.toSMILES(SmilesExport.java:905)
    at chemaxon.marvin.io.formats.smiles.SmilesExport.convert(SmilesExport.java:727)
    at chemaxon.struc.Molecule.exportToObject(Molecule.java:1627)
    at chemaxon.struc.Molecule.exportToObject(Molecule.java:1592)
    at chemaxon.struc.Molecule.exportToFormat(Molecule.java:1441)
    at chemaxon.struc.Molecule.toFormat(Molecule.java:1419)
    ... 2 more

ChemAxon 25dcd765a3

06-10-2011 06:58:50

Boron cannot be aromatic according to the SMILES definition.


#5&a

is aromatic boron.


You can ignore these (bad) SMILES by catching the chemaxon.marvin.io.MolExportException caused by the export.

User 55ffa2f197

06-10-2011 12:38:11

I tried the following to catch the MolExportException, but the compiler complains: "Unreachable catch block for MolExportException. This exception is never thrown from the try statement body". Based on the error information, m.toFormat( "smiles" ) is the line that caused the problem.


 


try {
    tsmiles.add( m.toFormat( "smiles" ) );
} catch (MolExportException e) {
}

ChemAxon 25dcd765a3

07-10-2011 08:10:49

Oops, you are right:


Exception in thread "main" java.lang.IllegalArgumentException: chemaxon.marvin.io.MolExportException: The following atom cannot be aromatic according to the SMILES definition: B




So what toFormat throws is not a MolExportException but an IllegalArgumentException; the MolExportException only appears as its cause.


I was not careful enough. You should catch the IllegalArgumentException.


So the correct code is:


try {
    System.out.println(molecule.toFormat("smiles"));
} catch (IllegalArgumentException e) {
    // molecule cannot be exported to SMILES
}
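For the original question (skip the bad SMILES, report them on stderr, and carry on with the next one), the same catch can sit inside the per-line loop. Below is a minimal sketch that uses only the standard library so it can run anywhere. SkipBadSmiles, convertAll and the fake converter are hypothetical names for illustration; in the real program the converter would wrap the MolImporter.importMol and m.toFormat("smiles") calls shown earlier. The checked MolFormatException from the import would still need its own catch inside that converter.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Function;

final class SkipBadSmiles {
    /**
     * Applies convert to every input. Inputs for which convert throws an
     * IllegalArgumentException are written to stderr, added to rejected,
     * and skipped; processing continues with the next input.
     */
    static List<String> convertAll(List<String> inputs,
                                   Function<String, String> convert,
                                   List<String> rejected) {
        List<String> converted = new ArrayList<>();
        for (String smiles : inputs) {
            try {
                converted.add(convert.apply(smiles));
            } catch (IllegalArgumentException e) {
                // Report the wrapped cause if there is one, else the exception itself.
                Throwable reason = (e.getCause() != null) ? e.getCause() : e;
                System.err.println(smiles + "\t" + reason.getMessage());
                rejected.add(smiles);
            }
        }
        return converted;
    }

    public static void main(String[] args) {
        // Hypothetical stand-in for the real import/export round trip:
        // it rejects the aromatic-boron pattern from this thread.
        Function<String, String> fakeConverter = s -> {
            if (s.contains("#5&a")) {
                throw new IllegalArgumentException(
                    new Exception("atom cannot be aromatic: B"));
            }
            return s;
        };
        List<String> rejected = new ArrayList<>();
        List<String> ok = convertAll(
            Arrays.asList("CCO", "*C(CN(*)*)CN1c2ccccc2-[#5&a]c-c2ccccc12", "c1ccccc1"),
            fakeConverter, rejected);
        System.out.println("converted: " + ok.size() + ", rejected: " + rejected.size());
    }
}
```

Keeping the try/catch inside the loop body is what lets one bad record fail without ending the whole run. Collecting the rejects in a list is optional; writing them to stderr alone may be enough for a run of several million lines.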