MolPrinter does not print smiles string correctly using samp

User 76d45de7d2

08-07-2008 20:09:01

I changed the smiles string in the sample code to this:


OC1C(O)C(O)C(O[P](O)(=O)OCC(COC([H])=O)OC([C3H7])=O)C(O)C1O





It is displayed correctly in PubChem (http://pubchem.ncbi.nlm.nih.gov/edit/index.html), but MolPrinter does not display it properly. See the attachment for my screenshot.





Can anyone help me with this?





import chemaxon.marvin.MolPrinter;
import chemaxon.struc.Molecule;
import chemaxon.formats.MolImporter;
import java.awt.Color;
import java.awt.Rectangle;
import java.awt.Graphics2D;
import java.awt.image.BufferedImage;
import java.io.File;
import java.io.IOException;
import javax.imageio.ImageIO;

public class MolPrinterTest {

    static BufferedImage createTestImage() throws IOException {
        // Create a molecule
        Molecule mol = MolImporter.importMol("CN1C=NC2=C1C(=O)N(C)C(=O)N2C");

        // Create a writable image
        BufferedImage im = new BufferedImage(400, 400,
                BufferedImage.TYPE_INT_ARGB);
        Graphics2D g = im.createGraphics();

        // Clear background
        g.setColor(Color.white);
        g.fillRect(0, 0, im.getWidth(), im.getHeight());

        // Draw the bounding rectangle
        g.setColor(Color.red);
        Rectangle r = new Rectangle(20, 20, 360, 200);
        g.draw(r);

        // Paint the molecule
        MolPrinter p = new MolPrinter(mol);
        p.setScale(p.maxScale(r)); // fit image in the rectangle
        p.paint(g, r);
        return im;
    }

    public static void main(String[] args) throws Exception {
        BufferedImage im = createTestImage();
        ImageIO.write(im, "png", new File("test.png"));
    }
}


ChemAxon 25dcd765a3

09-07-2008 05:50:21

Hi,





This is not a valid SMILES string:


OC1C(O)C(O)C(O[P](O)(=O)OCC(COC([H])=O)OC([C3H7])=O)C(O)C1O


The problem is with this "atom":


[C3H7]


which the SMILES specification does not accept.


You can rewrite this [C3H7] as CCC,


so your original SMILES would look like:


OC1C(O)C(O)C(O[P](O)(=O)OCC(COC([H])=O)OC(CCC)=O)C(O)C1O


which is correctly imported.
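
The rewrite from [C3H7] to CCC can also be done programmatically before the string is handed to the importer. The following is only a minimal sketch in plain Java, not part of Marvin: the class name AlkylPseudoAtomExpander and the method expand are hypothetical, and it assumes that every bracket group of the form [CnH2n+1] means an unbranched alkyl chain (a C3H7 written by hand could just as well be isopropyl, which this sketch cannot know).

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of any ChemAxon API.
final class AlkylPseudoAtomExpander {

    // Matches bracket groups such as [C3H7] or [C18H37].
    private static final Pattern ALKYL =
            Pattern.compile("\\[C(\\d{1,3})H(\\d{1,3})\\]");

    // Replaces each [CnH2n+1] group with a chain of n carbon atoms.
    // ASSUMPTION: the group is an unbranched, saturated alkyl chain.
    static String expand(String smiles) {
        return ALKYL.matcher(smiles).replaceAll(match -> {
            int carbons = Integer.parseInt(match.group(1));
            int hydrogens = Integer.parseInt(match.group(2));
            boolean saturatedAlkyl =
                    carbons >= 1 && hydrogens == 2 * carbons + 1;
            // Anything that is not CnH2n+1 is left untouched.
            String replacement =
                    saturatedAlkyl ? "C".repeat(carbons) : match.group();
            return Matcher.quoteReplacement(replacement);
        });
    }

    public static void main(String[] args) {
        String original =
                "OC1C(O)C(O)C(O[P](O)(=O)OCC(COC([H])=O)OC([C3H7])=O)C(O)C1O";
        System.out.println(expand(original));
    }
}
```

For the string from the first post this yields the same SMILES as the hand-edited version above, which can then be passed to MolImporter.importMol as in the sample code.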





Andras

ChemAxon 25dcd765a3

09-07-2008 06:00:05

Hi,


I have checked that PubChem accepts anything in square brackets.


So it accepts not only C[C3H7], which would at least have chemical meaning, but also C[LetsGo], which is surely not a valid SMILES string.
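
To see which bracket groups a strict parser is likely to reject, the contents of each pair of square brackets can be checked against the usual bracket-atom layout (optional isotope, symbol, optional chirality, hydrogen count, charge and atom class). The code below is a simplified sketch under stated assumptions: the class name BracketAtomCheck and the method suspiciousBracketAtoms are hypothetical, the symbol is checked by shape only and not against the periodic table, and only the @ and @@ chirality marks are recognised. It is not how Marvin or PubChem actually parse SMILES.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper illustrating the bracket-atom layout; simplified.
final class BracketAtomCheck {

    private static final Pattern BRACKET_ATOM = Pattern.compile(
            "(\\d+)?"                        // isotope
          + "([A-Z][a-z]?|[a-z]{1,2}|\\*)"   // symbol (shape only) or wildcard
          + "(@{1,2})?"                      // chirality: @ or @@ only
          + "(H\\d?)?"                       // hydrogen count
          + "([+-]\\d?|\\+\\+|--)?"          // charge
          + "(:\\d+)?");                     // atom class

    private static final Pattern BRACKETS = Pattern.compile("\\[([^\\]]*)\\]");

    // Returns the contents of every bracket group that does not fit the layout.
    static List<String> suspiciousBracketAtoms(String smiles) {
        List<String> suspicious = new ArrayList<>();
        Matcher m = BRACKETS.matcher(smiles);
        while (m.find()) {
            String content = m.group(1);
            if (!BRACKET_ATOM.matcher(content).matches()) {
                suspicious.add(content);
            }
        }
        return suspicious;
    }

    public static void main(String[] args) {
        System.out.println(suspiciousBracketAtoms("C[C3H7]"));
        System.out.println(suspiciousBracketAtoms("C[LetsGo]"));
        System.out.println(suspiciousBracketAtoms("C[NH3+]"));
    }
}
```

With this check both C3H7 and LetsGo are flagged, while ordinary bracket atoms such as [H], [P] or [NH3+] pass.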





Andras

User 76d45de7d2

09-07-2008 15:45:39

volfi wrote:
I have checked that PubChem accepts anything in square brackets.
So not just C[C3H7] is accepted which would have chemical meaning, but also C[LetsGo] which is surely not a valid smiles string.

Dear Andras,





I really appreciate your quick response.





* In Daylight's official SMILES document, any strings between a pair of square brackets ("[" and "]") are known as pseudoatoms. "C3H7" and "C18H35" are perfectly legitimate pseudoatoms, "LetsGo" is also OK to be viewed as a pseudoatom. So the PubChem viewer is correct to allow the flexibility of pseudoatoms.





* Marvin Beans developers should be able to modify their Java source code so that it behaves the same way as the PubChem viewer.





If this problem cannot be solved in a short period, would you recommend an alternative way for me to do so?





Best,





Tom

ChemAxon a3d59b832c

14-07-2008 12:05:09

Sorry for the delay. Volfi is on holiday, so I will try to answer in his absence.

toughdan wrote:
* In Daylight's official SMILES document, any strings between a pair of square brackets ("[" and "]") are known as pseudoatoms. "C3H7" and "C18H35" are perfectly legitimate pseudoatoms, "LetsGo" is also OK to be viewed as a pseudoatom. So the PubChem viewer is correct to allow the flexibility of pseudoatoms.

Could you give us the exact reference? I have not found any mention of it here, in the "official SMILES document":


http://www.daylight.com/dayhtml/doc/theory/theory.smiles.html

toughdan wrote:
If this problem cannot be solved in a short period, would you recommend an alternative way for me to do so?

ChemAxon extended SMILES can represent pseudo atoms.


For example:





[H]C(=O)OCC(COP(O)(=O)OC1C(O)C(O)C(O)C(O)C1O)OC(*)=O |$;;;;;;;;;;;;;;;;;;;;;;;;;C3H7_p;$|





http://www.chemaxon.com/marvin/help/formats/cxsmiles-doc.html
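
Judging from the example above, the part between |$ and $| holds one semicolon-separated label per atom, in atom order, with the pseudo atom's label carrying a _p suffix and all other labels left empty. The following sketch builds that field for a single pseudo atom. It is inferred only from the example in this post, not from the full format documentation linked above; the class name CxsmilesLabels and the method pseudoAtomField are hypothetical, and the atom count and zero-based atom index have to be supplied by the caller.

```java
import java.util.Arrays;

// Hypothetical helper; the field layout is inferred from the example above.
final class CxsmilesLabels {

    // Builds "|$;;...;LABEL_p;...$|" with one (mostly empty) entry per atom.
    // atomIndex is zero-based and counts atoms in the order they appear
    // in the SMILES string, including explicit [H] atoms.
    static String pseudoAtomField(int atomCount, int atomIndex, String label) {
        if (atomIndex < 0 || atomIndex >= atomCount) {
            throw new IllegalArgumentException(
                    "atomIndex " + atomIndex + " out of range 0.." + (atomCount - 1));
        }
        String[] fields = new String[atomCount];
        Arrays.fill(fields, "");
        fields[atomIndex] = label + "_p";
        return "|$" + String.join(";", fields) + "$|";
    }

    public static void main(String[] args) {
        // The SMILES above has 27 atoms; the '*' is the 26th (index 25).
        String smiles =
                "[H]C(=O)OCC(COP(O)(=O)OC1C(O)C(O)C(O)C(O)C1O)OC(*)=O";
        System.out.println(smiles + " " + pseudoAtomField(27, 25, "C3H7"));
    }
}
```

Counting the atoms by hand is error-prone, so in practice it is safer to let Marvin write the extended SMILES itself and use a helper like this only to understand the layout.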





But to be honest, this is not a proper representation of the chemical structure. Marvin also supports abbreviated groups (drawn by typing the keyboard shortcut or using the Insert/Groups menu), as in the attached example. Abbreviated groups properly represent the underlying chemical structure, and they can be extracted, contracted, etc. Unfortunately, they cannot be represented directly in SMILES; instead, the expanded formula is generated on conversion to SMILES:


[H]C(=O)OCC(COP(O)(=O)OC1C(O)C(O)C(O)C(O)C1O)OC(=O)CCC





Best regards,


Szabolcs