Possible bug in conversion to Smarts

User 7b0ee04e66

20-01-2011 10:55:35

Good morning


I may have found a bug in the conversion to Extended SMARTS in Marvin 5.4.0.0.


When I convert the attached molecule to Extended SMARTS, I get


[$([#6]O)]-c1ccccc1 |$_R1;;;;;;$|


But to retrieve the correct methoxy compounds, it should be converted to


[$([#6]O[#6])]-c1ccccc1 |$_R1;;;;;;$|


Is this correct?


 


Thanks,


Catherine

ChemAxon 25dcd765a3

21-01-2011 16:31:08

Dear Catherine,


Thank you for your report!


The correct Extended Smarts is:


[$([#8][#6])]-c1ccccc1 |$_R1;;;;;;$|


The molecule contains one Rgroup attached to a benzene ring.


The atom that connects to the benzene ring is an oxygen atom. (It is important that the R1 "atom" is actually a structure, so we should imagine an O-C fragment in place of the R1 label.) The first atom in the recursive SMARTS string should therefore be oxygen. The atoms that follow the first oxygen in the recursive SMARTS define the environment of that oxygen atom, in this case a carbon atom. But if I take a closer look at this oxygen atom, its environment is not just a carbon atom: it has a carbon atom on one side and a benzene ring on the other. So the really correct SMARTS string would be:


[$([#8]([#6])C1=CC=CC=C1)]C1=CC=CC=C1 |$_R1;;;;;;$,c:3,5,t:1|
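
To make the reasoning above concrete, here is a minimal pure-Python sketch. It is not Marvin's implementation and not a real SMARTS parser: it hand-builds three toy molecules (anisole, phenol, benzyl alcohol) and mimics only patterns of the shape [$([#A][#B])]-c. The helper names benzene_with, neighbors, is_aromatic_atom and matches are hypothetical, hydrogens are implicit, and an atom is treated as aromatic if it has at least one aromatic bond.

```python
def benzene_with(substituent_atoms, single_bonds):
    """Benzene ring (atoms 0-5) plus a substituent (atoms 6, 7, ...).

    substituent_atoms: atomic numbers of the extra atoms.
    single_bonds: (i, j) index pairs joined by single bonds.
    """
    atoms = [6] * 6 + list(substituent_atoms)
    bonds = {frozenset((i, (i + 1) % 6)): "aromatic" for i in range(6)}
    for i, j in single_bonds:
        bonds[frozenset((i, j))] = "single"
    return atoms, bonds


def neighbors(mol, i):
    """Return (neighbor index, bond type) pairs for atom i."""
    _, bonds = mol
    result = []
    for pair, order in bonds.items():
        if i in pair:
            (j,) = pair - {i}
            result.append((j, order))
    return result


def is_aromatic_atom(mol, i):
    """Toy aromaticity test: the atom takes part in an aromatic bond."""
    return any(order == "aromatic" for _, order in neighbors(mol, i))


def matches(mol, first, env, env_excludes_ring=False):
    """Mimic [$([#first][#env])]-c on a toy molecule.

    The matched atom is the FIRST atom of the recursive part; `env` only
    describes its environment. With env_excludes_ring=True the environment
    atom must differ from the ring carbon used for the attachment, which
    imitates listing both the carbon and the ring in the environment.
    """
    atoms, _ = mol
    hits = []
    for i, z in enumerate(atoms):
        if z != first:
            continue
        nbrs = neighbors(mol, i)
        # Aromatic carbons reached through a single bond ("-c").
        ring_nbrs = {j for j, order in nbrs
                     if order == "single" and atoms[j] == 6
                     and is_aromatic_atom(mol, j)}
        # Any neighbor with the requested atomic number.
        env_nbrs = {j for j, _ in nbrs if atoms[j] == env}
        if not ring_nbrs:
            continue
        if env_excludes_ring:
            ok = any(e != r for e in env_nbrs for r in ring_nbrs)
        else:
            ok = bool(env_nbrs)
        if ok:
            hits.append(i)
    return hits


anisole = benzene_with([8, 6], [(0, 6), (6, 7)])         # ring-O-C
phenol = benzene_with([8], [(0, 6)])                     # ring-O(H)
benzyl_alcohol = benzene_with([6, 8], [(0, 6), (6, 7)])  # ring-C-O(H)

# Carbon-first pattern, like [$([#6]O)]-c
print(matches(anisole, 6, 8), matches(benzyl_alcohol, 6, 8))
# Oxygen-first pattern, like [$([#8][#6])]-c
print(matches(anisole, 8, 6), matches(phenol, 8, 6))
# Oxygen-first with a carbon that is distinct from the ring attachment
print(matches(anisole, 8, 6, True), matches(phenol, 8, 6, True))
```

In this toy model the carbon-first pattern misses anisole and hits benzyl alcohol instead. The oxygen-first pattern finds anisole but also phenol, because the ring carbon itself satisfies the [#6] environment. Phenol is excluded only when the environment carbon must be distinct from the ring attachment, which is the point of spelling out both sides of the oxygen.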


We have in fact noticed this discrepancy and plan to modify the cxsmarts export for Rgroups.


Moreover, Rgroups can have more than one attachment point, which, to my knowledge, is impossible to handle in SMARTS. The new cxsmarts export will solve this problem too.