odd match behavior for badly drawn cycloalkenes

User 870ab5b546

04-02-2011 15:40:10

I don't know if this is a bug or a feature, but consider this target:


<?xml version="1.0" ?>
<cml>
<MDocument>
<MChemicalStruct>
<molecule molID="m1">
<atomArray
atomID="a1 a2 a3 a4 a5 a6 a7"
elementType="C C C C C C R"
sgroupRef="0 0 0 0 0 0 sg1"
x2="-1.9730666666666667 -3.3068000000000004 -3.3068000000000004 -1.9730666666666667 -0.6395200000000001 -0.6395200000000001 -0.6876450286102296"
y2="2.598773333333334 1.8287733333333336 0.2887733333333334 -0.4812266666666667 0.2887733333333334 1.8287733333333336 3.513148180745443"
/>
<bondArray>
<bond atomRefs2="a1 a2" order="1" />
<bond atomRefs2="a1 a6" order="1" />
<bond atomRefs2="a3 a4" order="1" />
<bond atomRefs2="a4 a5" order="1" />
<bond atomRefs2="a5 a6" order="2" />
<bond atomRefs2="a6 a7" order="1" />
<bond atomRefs2="a2 a3" order="1" />
</bondArray>
<molecule id="sg1" role="SuperatomSgroup" title="Ph" molID="m2">
<atomArray
atomID="a8 a9 a10 a11 a12 a13"
elementType="C C C C C C"
attachmentPoint="1 0 0 0 0 0"
sgroupAttachmentPoint="1 0 0 0 0 0"
x2="2.7016623082636517 2.701662308263649 1.3679831864356125 0.03430406460757762 0.034304064607578955 1.3679831864356142"
y2="9.095624980926512 7.5556249809265115 6.785624980926514 7.555624980926516 9.095624980926516 9.865624980926516"
/>
<bondArray>
<bond atomRefs2="a8 a9" order="2" />
<bond atomRefs2="a8 a13" order="1" />
<bond atomRefs2="a9 a10" order="1" />
<bond atomRefs2="a10 a11" order="2" />
<bond atomRefs2="a11 a12" order="1" />
<bond atomRefs2="a12 a13" order="2" />
</bondArray>
</molecule>
</molecule>
</MChemicalStruct>
</MDocument>
</cml>

This target fails to match the query C1CCC(=CC1)C1=CC=CC=C1.


However, if one turns on View E/Z stereo in Marvin, Marvin fails to show an E/Z marker for the cyclohexene ring in either the target or the query, implying that it assumes that the stereochemistry is E, even if the structure is badly drawn, as it is in the target.


Taken together, this behavior appears to be inconsistent.  


I would suggest that whether a cycloalkene is E or Z should be determined by the mutual orientation of the ring bonds, not any substituents outside the ring.  With this assumption, the target would be calculated as having E stereochemistry, and it would match the query, as one would expect.  


If you don't want to implement that change, then show E/Z stereochemistry for ring bonds.
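
The ring-bond rule suggested above can be sketched in a few lines. This is only an illustration, not how Marvin or JChem determines stereochemistry: the coordinates are copied from the target's x2/y2 attributes, and the helper names `side` and `same_side` are made up for this example. The side of the a5=a6 double-bond axis on which each neighbor lies is taken from the sign of a 2D cross product.

```python
# Coordinates copied from the target's atomArray (x2, y2) for the atoms
# around the ring double bond a5=a6. Helper names below are hypothetical.
COORDS = {
    "a1": (-1.9730666666666667, 2.598773333333334),    # ring neighbor of a6
    "a4": (-1.9730666666666667, -0.4812266666666667),  # ring neighbor of a5
    "a5": (-0.6395200000000001, 0.2887733333333334),   # double-bond atom
    "a6": (-0.6395200000000001, 1.8287733333333336),   # double-bond atom
    "a7": (-0.6876450286102296, 3.513148180745443),    # R (Ph) on a6
}


def side(start, end, point):
    """Return +1 or -1 for the side of the line start->end on which
    point lies, or 0 if the three atoms are collinear."""
    (ax, ay), (bx, by), (px, py) = COORDS[start], COORDS[end], COORDS[point]
    cross = (bx - ax) * (py - ay) - (by - ay) * (px - ax)
    return (cross > 0) - (cross < 0)


def same_side(sub1, sub2, start="a5", end="a6"):
    """True if both substituents lie on the same side of the double bond."""
    s1, s2 = side(start, end, sub1), side(start, end, sub2)
    return s1 != 0 and s1 == s2


# Ring-bond rule: compare the two ring neighbors of the double bond.
ring_bonds_cis = same_side("a4", "a1")

# Substituent-based reading: compare the ring neighbor a4 with the
# exocyclic R group a7, which is what the drawing literally shows.
r_group_drawn_cis = same_side("a4", "a7")

print("ring bonds cis:", ring_bonds_cis)
print("R group drawn on the same side as a4:", r_group_drawn_cis)
```

For these coordinates both checks come out True: the ring bonds are cis, as they must be in a six-membered ring, yet the R group is drawn marginally on the same side of the double-bond axis as a4. That is the ambiguity in the drawing; reading the geometry from the ring bonds alone avoids it.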

ChemAxon 42004978e8

07-02-2011 14:04:46

Hi Bob,


 


The difference lies in the fact that the mrv format contains 2D information while SMILES doesn't. So the mrv version is imported and displayed as specified, while for the SMILES version the coordinates around the double bond are adjusted according to the small-ring membership. This procedure is not executed for the mrv version, which stores the coordinates as the user saved them.


You can correct the mrv structure by executing the 2D clean operation.


Bye,


Robert

User 870ab5b546

07-02-2011 19:16:22

I wrote the query in SMILES format simply to save space.  The described behavior occurs when I draw the query in MarvinSketch.  You can draw the target and query on this page and see the behavior.  (Choose duplicate search.)

ChemAxon 25dcd765a3

08-02-2011 09:18:21

Dear Bob,


I totally agree with your observations.


We considered this problem earlier and plan to solve it by returning the correct cis/trans information in small rings whose ligands are drawn ambiguously. The solution needs some type of ring detection, which may slow down our code, so it is not as easy as it looks at first sight.


Regarding the time frame, we cannot fix cis/trans detection in 5.5 because we are overloaded with tasks; it will have to wait until just after 5.5. But we will try to fix the E/Z detection (which is much slower than cis/trans detection anyway) in 5.5. (Marvin 5.4.1 is almost out; the commit deadline has passed.)


Thank you for the report.


Andras