User 4df9fb85ce
06-06-2014 22:17:00
Hello,
I've been working with a set of molecules without coordinates in CML format. The dataset contains both a SMILES string and a CML file for each molecule, and the two should describe the same structure. The problem is that Marvin interprets some stereocenters in these CML files differently from other cheminformatics packages. I tried to localize the problem and found that the order of atoms in the CML file influences stereocenter parity for some reason.
I took the following CML file:
<?xml version="1.0" ?>
<cml>
<molecule>
<atomArray>
<atom id="a0" elementType="C"/>
<atom id="a1" elementType="C"/>
<atom id="a2" elementType="C" isotope="13" isotopeNumber="13">
<atomParity atomRefs4="a3 a1 a4 a2">1</atomParity>
</atom>
<atom id="a3" elementType="C"/>
<atom id="a4" elementType="C" isotope="14" isotopeNumber="14">
<atomParity atomRefs4="a2 a8 a7 a4">1</atomParity>
</atom>
<atom id="a5" elementType="N"/>
<atom id="a6" elementType="C"/>
<atom id="a7" elementType="O"/>
<atom id="a8" elementType="C"/>
<atom id="a9" elementType="O"/>
</atomArray>
<bondArray>
<bond atomRefs2="a0 a1" order="1"/>
<bond atomRefs2="a1 a2" order="1"/>
<bond atomRefs2="a2 a3" order="1"/>
<bond atomRefs2="a3 a5" order="1"/>
<bond atomRefs2="a5 a6" order="1"/>
<bond atomRefs2="a6 a7" order="1"/>
<bond atomRefs2="a7 a4" order="1"/>
<bond atomRefs2="a2 a4" order="1"/>
<bond atomRefs2="a4 a8" order="1"/>
<bond atomRefs2="a8 a9" order="2"/>
</bondArray>
</molecule>
</cml>
In that file I swapped the order of atoms a7 and a8, from
<atom id="a7" elementType="O"/>
<atom id="a8" elementType="C"/>
to
<atom id="a8" elementType="C"/>
<atom id="a7" elementType="O"/>
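The reordering can also be reproduced programmatically, which makes it easier to test other atom pairs. This is a minimal sketch using Python's standard xml.etree.ElementTree, assuming the CML uses no XML namespace (as in the file above). The function name swap_atom_order is hypothetical and not part of Marvin or any CML toolkit.

```python
import xml.etree.ElementTree as ET


def swap_atom_order(cml_text, id_a, id_b):
    """Return CML text with the <atom> elements id_a and id_b swapped.

    Only the position of the two elements inside <atomArray> changes;
    ids, atomParity children and the bondArray are left untouched.
    Assumes un-namespaced CML (hypothetical helper, not a Marvin API).
    """
    root = ET.fromstring(cml_text)
    atom_array = root.find(".//atomArray")
    atoms = list(atom_array)
    ids = [atom.get("id") for atom in atoms]
    i, j = ids.index(id_a), ids.index(id_b)
    atoms[i], atoms[j] = atoms[j], atoms[i]

    # Rebuild the child list in the new order.
    for atom in list(atom_array):
        atom_array.remove(atom)
    for atom in atoms:
        atom_array.append(atom)
    return ET.tostring(root, encoding="unicode")
```

Calling swap_atom_order(text, "a7", "a8") on the contents of the first file should give a file equivalent to the second one.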
Marvin gave me two different molecules: {mol_order1.png} and {mol_order2.png}
I compared Marvin's results with OpenBabel and Indigo, and they both produce the same SMILES and image for these molecules:
CC[13C@@H]1CNCO[14C@@H]1C=O
Marvin produces two different molecules for {mol_order1.cml} and {mol_order2.cml}:
CC[13C@@H]1CNCO[14C@H]1C=O
CC[13C@@H]1CNCO[14C@@H]1C=O
I looked at the CML specification but couldn't find anything that explains the difference between these two CML molecules. Could you check whether this is a bug?
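One observation that may help narrow this down, offered as a hypothesis only: the atomRefs4 lists are identical in both files, but the parity of the permutation that sorts each list into file order is not. The sketch below computes that permutation sign for both atom orders. Interpreting atomRefs4 relative to atom positions in the file is an assumption made here for illustration; nothing in the CML specification or the Marvin documentation quoted in this post confirms that Marvin does this. The helper names are hypothetical.

```python
def permutation_sign(seq):
    """Return +1 if seq has an even number of inversions, -1 if odd."""
    inversions = sum(
        1
        for i in range(len(seq))
        for j in range(i + 1, len(seq))
        if seq[i] > seq[j]
    )
    return 1 if inversions % 2 == 0 else -1


def refs_sign(atom_order, atom_refs4):
    """Sign of the atomRefs4 permutation relative to positions in the file."""
    position = {atom_id: idx for idx, atom_id in enumerate(atom_order)}
    return permutation_sign([position[a] for a in atom_refs4.split()])


# Atom order in the original file and in the file with a7/a8 swapped.
order1 = ["a0", "a1", "a2", "a3", "a4", "a5", "a6", "a7", "a8", "a9"]
order2 = ["a0", "a1", "a2", "a3", "a4", "a5", "a6", "a8", "a7", "a9"]

refs_a2 = "a3 a1 a4 a2"  # atomParity on atom a2
refs_a4 = "a2 a8 a7 a4"  # atomParity on atom a4

for name, order in (("order1", order1), ("order2", order2)):
    print(name, "a2:", refs_sign(order, refs_a2), "a4:", refs_sign(order, refs_a4))
```

Under this assumption the sign for a2 is the same in both files, while the sign for a4 flips. That matches the Marvin output above, where only the 14C center changes between the two SMILES.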
PS: I found a remark in your documentation about CML: http://www.chemaxon.com/marvin/help/formats/cml-doc.html
Attention: When a cml file containing parity information is imported to Marvin older than 5.8, the parity information will be displayed wrongly!
I'm using Marvin 6.3, but it seems that the parity information is still displayed wrongly. Or is it the other way round: did versions older than 5.8 behave the way I expect, and you later decided that behavior was a bug? If so, could you explain why the molecules {mol_order1.cml} and {mol_order2.cml} are different?
Best regards,
Michael