Duplicate structure issue

User 8a7878ec6d

14-07-2013 09:28:25

Hi,


When compiling a database from a subset of PubChem, I performed an overlap analysis in order to identify duplicate structures. PubChem compound 5312911 was perceived as a duplicate of both 5312912 and 5312913, while it is in fact a totally different structure (5312911 has no methyl between the carbonyl and double bond).


Indeed, if I create a new structure table containing 5312912 and 5312913 only, with duplicate filtering and tautomer duplicate checking activated, it will not allow me to add 5312911.


Can you reproduce this? I am attaching my structures for your reference.


This is on IJC 6.0.2, 64-bit Windows.

ChemAxon 2bdd02d1e5

15-07-2013 13:36:13

Hi,


I can't reproduce it with a local DB. Do you have any standardizers defined on the entity?


Thanks for your report!

ChemAxon 2bdd02d1e5

15-07-2013 13:40:34

Yes, I can reproduce it; the bug is in the tautomer duplicate checking feature.


Thanks again for the report!

User 8a7878ec6d

15-07-2013 13:46:10

Hi,


For the sake of completeness: this is in a Derby local database, and I have no standardization defined.


Cheers,


Evert

ChemAxon abe887c64e

16-10-2013 16:16:37

Hi Evert,


Our tautomer duplicate search compares the generic tautomers of the structures. In the examples you sent, all three structures have the same generic tautomer form, which is why we identify them as tautomers of each other.
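
To illustrate why comparing generic tautomers can group structures that an exact comparison keeps apart, here is a minimal toy sketch in Python. It is not the ChemAxon implementation: the record ids, the "generic-1"/"form-x" labels and the helper names (exact_key, generic_key, find_duplicates) are all hypothetical placeholders, and no real structure perception is performed.

```python
from collections import defaultdict

# Hypothetical toy records: id -> (generic_form, specific_form).
# The labels are placeholders, not real PubChem data or ChemAxon output.
RECORDS = {
    "cid_a": ("generic-1", "form-x"),
    "cid_b": ("generic-1", "form-y"),
    "cid_c": ("generic-1", "form-z"),
    "cid_d": ("generic-2", "form-w"),
}


def exact_key(structure):
    """Strict comparison: the whole structure is the key."""
    return structure


def generic_key(structure):
    """Coarse comparison: only the (assumed) generic tautomer form is the key."""
    return structure[0]


def find_duplicates(records, key_fn):
    """Group record ids by key and return the groups with more than one member."""
    groups = defaultdict(list)
    for record_id, structure in records.items():
        groups[key_fn(structure)].append(record_id)
    return [sorted(ids) for ids in groups.values() if len(ids) > 1]


exact_duplicates = find_duplicates(RECORDS, exact_key)
generic_duplicates = find_duplicates(RECORDS, generic_key)
```

With the strict key no record is flagged, while the coarse key puts the three records sharing "generic-1" into one duplicate group. The coarser the key, the more distinct structures can end up treated as the same entry.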


See our documentation about tautomers: https://www.chemaxon.com/marvin/help/calculations/tautomers.html


In MarvinSketch you can generate the generic tautomer from Calculations->Isomers->Tautomers.


Best regards,


Krisztina