Tricky stereogeneration and naming problem.

User 677b9c22ff

28-08-2008 17:55:12

Hi,


I have a substance that comes from here [PDF]. The publication says it has 39 stereoisomers. It is a four-membered ring, and the SMILES code is:


CC(O)C1C(C(C)O)C(C(C)O)C1C(C)O





1) Marvin 5.01 generates 44 stereoisomers.


2) In Instant-JChem, if I increase the fingerprint and bit settings to the maximum, the import removes three structures, so the count is 41.


Code:



Structure 23 not imported. It is a duplicate of CD_ID 14 in the database.


Structure 27 not imported. It is a duplicate of CD_ID 18 in the database.


Structure 44 not imported. It is a duplicate of CD_ID 40 in the database.








3) I don't trust canonical SMILES, so I didn't check them.


4) I would trust InChI, but it says that out of the 44 there are only 2 duplicates, so the count is 43.


5) Generating the name is usually a good way to check; the old naming in Instant-JChem generates 38 unique names.


6) The cxcalc name tool generates 26 unique names, and the last 4 are duplicates (the structures are different). See the names in the Excel file.
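The checks in steps 2 to 6 all come down to the same operation: compute one identifier string per generated structure (name, InChI, canonical SMILES) and count how many distinct strings there are. Here is a minimal Python sketch of that counting step. It assumes the identifiers have already been exported to a list in structure order; the function names and the sample strings are placeholders for illustration, not output from any of the tools above.

```python
from collections import defaultdict


def group_by_identifier(identifiers):
    """Map each identifier string to the 1-based structure positions sharing it."""
    groups = defaultdict(list)
    for position, identifier in enumerate(identifiers, start=1):
        groups[identifier].append(position)
    return dict(groups)


def summarize(identifiers):
    """Return (number of distinct identifiers, groups with more than one member)."""
    groups = group_by_identifier(identifiers)
    duplicates = {key: members for key, members in groups.items() if len(members) > 1}
    return len(groups), duplicates


if __name__ == "__main__":
    # Placeholder identifiers only; real input would be one name or InChI per structure.
    sample = ["id-A", "id-B", "id-A", "id-C"]
    unique_count, duplicates = summarize(sample)
    print("unique identifiers:", unique_count)
    print("shared by several structures:", duplicates)
```

Running the same summary on each identifier type makes it easy to see which structure numbers each method merges.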





A) So there is a naming problem for sure.


B) There is a stereoisomer generation problem, which wouldn't be a problem if there were a quick way to find the duplicates.
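One quick way to narrow this down is to compare which pairs of structures each method treats as identical. Pairs merged by two independent identifiers are likely true duplicates, while pairs merged by only one point to a problem in that identifier. The Python sketch below assumes two lists of identifier strings in the same structure order; the function names and the sample data are hypothetical.

```python
from itertools import combinations


def colliding_pairs(identifiers):
    """Return the set of 1-based (i, j) pairs, i < j, with equal identifier strings."""
    return {
        (i, j)
        for (i, a), (j, b) in combinations(enumerate(identifiers, start=1), 2)
        if a == b
    }


def compare_methods(first, second):
    """Split colliding pairs into those found by both methods or by only one."""
    first_pairs = colliding_pairs(first)
    second_pairs = colliding_pairs(second)
    return {
        "both": first_pairs & second_pairs,
        "first_only": first_pairs - second_pairs,
        "second_only": second_pairs - first_pairs,
    }


if __name__ == "__main__":
    # Placeholder data: four structures, two hypothetical identifier types.
    names = ["n1", "n2", "n1", "n3"]
    keys = ["k1", "k2", "k1", "k2"]
    for label, pairs in compare_methods(names, keys).items():
        print(label, sorted(pairs))
```

With 44 structures this checks 946 pairs, so the quadratic comparison costs nothing in practice.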





See attached file.


Tobias

ChemAxon 25dcd765a3

29-08-2008 15:37:46

Dear Tobias,





You get 44 stereoisomers because the stereoisomer code filters the results by unique SMILES.


We already know of cases where some stereoisomers are the same but are not filtered out.


We will improve this in Marvin 5.2.


Thank you for the report.





Andras