Correct number of stereoisomers - which approach?

User 677b9c22ff

04-11-2008 20:13:20

Hi,


For canonicalizing and counting non-stereoisomers it is possible to generate SMILES, aromatize them and then delete duplicates. In the case of sugars or other complex stereoisomers this is not possible.





Marvin 5.0.1 and cxcalc generate 532 stereoisomers for the inositol:


OC1C(O)C(O)C(OC2C(O)C(O)C(O)C(O)C2O)C(O)C1O


There is a confirmed issue in version 5.0.1, which calculates wrong stereoisomer numbers.





In the report "Manual Construction and Mathematics- and Computer-Aided Counting of Stereoisomers. The Example of Oligoinositols" [PDF], Kerber, Gugisch and Ruecker write:


"Forming dimers from these 32 monomers, we obtain 32 homodimers and 32 · 31/2 = 496 heterodimers. Thus there is a total of 528 stereoisomers of formula 2."
- where number 2 is OC1C(O)C(O)C(OC2C(O)C(O)C(O)C(O)C2O)C(O)C1O
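
The count in the quote can be checked with a few lines of Python. This is only a sketch of the arithmetic, assuming (as the quote does) that a dimer is an unordered pair of monomers; the function name `count_dimers` is made up for illustration and does not come from the paper.

```python
from itertools import combinations_with_replacement


def count_dimers(n_monomers):
    """Count unordered pairs of monomers (hypothetical helper, not from the paper)."""
    homo = n_monomers                              # each monomer paired with itself
    hetero = n_monomers * (n_monomers - 1) // 2    # unordered pairs of distinct monomers
    return {"homodimers": homo, "heterodimers": hetero, "total": homo + hetero}


counts = count_dimers(32)
print(counts)

# Cross-check by brute-force enumeration of all unordered pairs.
brute_force = sum(1 for _ in combinations_with_replacement(range(32), 2))
print(brute_force == counts["total"])
```

So 532 structures from cxcalc means four more than the 528 expected from the paper.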











So just for me to remember there are three ways:


A) Create unique (canonical) SMILES.





I took the SMILES generated by Marvin and ran molconvert on them:


Code:



Z:\>molconvert smiles:u inositols-532.smiles > inositols-532-uniquesmiles-u.smi


Z:\>molconvert smiles:q inositols-532.smiles > inositols-532-uniquesmiles-q.smi








which resulted in 528 unique stereoisomers.
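
Counting the distinct strings in the converted file is then a plain text operation. A minimal sketch in Python, assuming the file has one SMILES per line (optionally followed by a name) and that identical structures were written as identical strings; the helper name `count_unique_smiles` is hypothetical and the input below is toy data, not the inositol file.

```python
from collections import Counter


def count_unique_smiles(lines):
    """Return (number of distinct SMILES, Counter of strings seen more than once)."""
    smiles = []
    for line in lines:
        fields = line.split()
        if fields:                      # skip blank lines
            smiles.append(fields[0])    # first column is the SMILES, the rest is ignored
    counts = Counter(smiles)
    duplicates = Counter({s: n for s, n in counts.items() if n > 1})
    return len(counts), duplicates


# Toy input; for a real file pass the open file handle instead of a list.
toy_lines = ["CCO ethanol", "CC(O)=O acetic_acid", "CCO ethanol_again", "", "C methane"]
n_unique, repeated = count_unique_smiles(toy_lines)
print(n_unique, dict(repeated))
```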





B) Use Instant JChem with the highest fingerprint bit settings and, before import, create a table with the "no duplicates" option.





Code:



Structure is mapped to current field Structure


Starting to import data...


Structure 393 not imported. It is a duplicate of CD_ID 32 in the database.


Structure 394 not imported. It is a duplicate of CD_ID 158 in the database.


Structure 531 not imported. It is a duplicate of CD_ID 500 in the database.


Structure 532 not imported. It is a duplicate of CD_ID 524 in the database.





Import completed in 161s.


528 entries successfully imported.


0 Errors.


4 were not imported as they were duplicates


Duplicate records can be found at


Z:\inositols-532_duplicates.smiles
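
To see which input structures were rejected and which database entries they collided with, the duplicate pairs can be pulled out of such a log. A rough sketch in Python, assuming the log lines have exactly the wording shown above; the regular expression and the function name `parse_duplicates` are my own and not part of Instant JChem.

```python
import re

# Matches lines like:
# "Structure 393 not imported. It is a duplicate of CD_ID 32 in the database."
DUPLICATE_LINE = re.compile(
    r"Structure (\d+) not imported\. It is a duplicate of CD_ID (\d+)"
)


def parse_duplicates(log_text):
    """Return a list of (structure_number, cd_id) pairs found in the log text."""
    return [(int(s), int(c)) for s, c in DUPLICATE_LINE.findall(log_text)]


log = """\
Structure 393 not imported. It is a duplicate of CD_ID 32 in the database.
Structure 394 not imported. It is a duplicate of CD_ID 158 in the database.
Structure 531 not imported. It is a duplicate of CD_ID 500 in the database.
Structure 532 not imported. It is a duplicate of CD_ID 524 in the database.
"""

pairs = parse_duplicates(log)
print(pairs)
print(532 - len(pairs))  # structures left after dropping the duplicates
```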








C) If all the structures, including the duplicates, are imported, use the Instant JChem overlap function with the stereoisomers option set to "yes" to create an overlap of the table with itself.





***


Is it possible to fix that problem so Marvin and cxcalc calculate the correct number of stereoisomers in the first place?


***





I attached several files with the names, the SMILES and so on.


Cheers


Tobias

ChemAxon e08c317633

06-11-2008 17:17:23

Hi Tobias,





We use unique SMILES for filtering out duplicate stereoisomers, but it seems there is a bug somewhere in the code. We will fix this; the fix can be expected in the 5.2 release.





Zsolt