User 677b9c22ff
04-11-2008 19:35:08
Hi,
I enumerated all stereoisomers of an
inositol OC1C(O)C(O)C(OC2C(O)C(O)C(O)C(O)C2O)C(O)C1O and got 532. The
number is wrong; it should be lower (528). I will open a ticket for that later.
If I calculate the names with cxcalc name inositols-532.smi, I get:
Code: |
O[C@H]1[C@@H](O)[C@@H](O)[C@@H](O[C@H]2[C@@H](O)[C@H](O)[C@H](O)[C@@H](O)[C@H]2O)[C@H](O)[C@@H]1O
O[C@H]1[C@@H](O)[C@@H](O)[C@H](O[C@H]2[C@@H](O)[C@H](O)[C@H](O)[C@@H](O)[C@H]2O)[C@H](O)[C@@H]1O
|
Code: |
527 (1R,2R,4R,5R)-6-{[(1S,2R,3R,4S,5R,6S)-2,3,4,5,6-pentahydroxycyclohexyl]oxy}cyclohexane-1,2,3,4,5-pentol
528 (1R,2R,4R,5R)-6-{[(1S,2R,3R,4S,5R,6S)-2,3,4,5,6-pentahydroxycyclohexyl]oxy}cyclohexane-1,2,3,4,5-pentol
|
which is the same name for both (possible if the naming algorithm is a strong canonizer).
But in Marvin 5.0.1 (which has confirmed stereogen issues) I get
different names:
Code: |
Preferred IUPAC Name = (1R,2R,3S,4R,5S,6S)-6-{[(2R,3R,5R,6R)-2,3,4,5,6-pentahydroxycyclohexyl]oxy}cyclohexane-1,2,3,4,5-pentol
Preferred IUPAC Name = (1R,2R,4R,5R)-6-{[(1S,2R,3R,4S,5R,6S)-2,3,4,5,6-pentahydroxycyclohexyl]oxy}cyclohexane-1,2,3,4,5-pentol
|
I also checked different options (single mode, non-IUPAC, etc.),
but cxcalc generates different names.
I attached two files, one with the smiles and one with the generated names.
Tobias
ChemAxon e7b9408ca1
05-11-2008 14:25:13
Dear Tobias,
As far as I can see the situation is as follows. The molecules are the same, in particular they have the same chirality, they are just represented differently. Both names generated by Marvin 5.0 are correct (they depend on which cycle is chosen to be the parent). For implementation reasons, Marvin 5.1 generates the same name in both cases.
In general, do not expect that the "same" molecule will be given only one name. The IUPAC standard specifically makes several names acceptable in countless cases. There is a draft specification from IUPAC that tries to assign a preferred name in such cases, but
1. it is only a draft
2. it does not cover all cases (yet). For instance I could not find any rule saying which of those two names is preferred.
Is this satisfactory?
User 677b9c22ff
05-11-2008 18:12:50
Hi Daniel,
The substances are not the same. This is a (possibly severe) error in the naming algorithm. See also
Inositol naming bug.
The unique SMILES are different:
O[C@H]1[C@@H](O)[C@@H](O)[C@@H](O[C@H]2[C@@H](O)[C@H](O)[C@H](O)[C@@H](O)[C@H]2O)[C@H](O)[C@@H]1O
O[C@H]1[C@@H](O)[C@@H](O)[C@H](O[C@H]2[C@@H](O)[C@H](O)[C@H](O)[C@@H](O)[C@H]2O)[C@H](O)[C@@H]1O
Code: |
cxcalc name "O[C@H]1[C@@H](O)[C@@H](O)[C@@H](O[C@H]2[C@@H](O)[C@H](O)[C@H](O)[C@@H](O)[C@H]2O)[C@H](O)[C@@H]1O"
id Preferred IUPAC Name
1 (1R,2R,4R,5R)-6-{[(1S,2R,3R,4S,5R,6S)-2,3,4,5,6-pentahydroxycyclohexyl]oxy}cyclohexane-1,2,3,4,5-pentol
cxcalc name "O[C@H]1[C@@H](O)[C@@H](O)[C@H](O[C@H]2[C@@H](O)[C@H](O)[C@H](O)[C@@H](O)[C@H]2O)[C@H](O)[C@@H]1O"
id Preferred IUPAC Name
1 (1R,2R,4R,5R)-6-{[(1S,2R,3R,4S,5R,6S)-2,3,4,5,6-pentahydroxycyclohexyl]oxy}cyclohexane-1,2,3,4,5-pentol
|
Now from Marvin: after the SMILES are canonized,
the SMILES strings have different lengths,
which means they are not the same.
However, the generated names are now the same.
So either the SMILES canonizer is broken, which would be really,
really bad, or the name generator has the bug mentioned above.
Code: |
Preferred IUPAC Name = (1R,2R,4R,5R)-6-{[(1S,2R,3R,4S,5R,6S)-2,3,4,5,6-pentahydroxycyclohexyl]oxy}cyclohexane-1,2,3,4,5-pentol
Preferred IUPAC Name = (1R,2R,4R,5R)-6-{[(1S,2R,3R,4S,5R,6S)-2,3,4,5,6-pentahydroxycyclohexyl]oxy}cyclohexane-1,2,3,4,5-pentol
|
The substances are not the same. See attached picture.
For whatever reason I always thought that IUPAC Names
are canonized names, so wherever the algorithm starts,
it should find the same name. That is not the case here.
Actually, the structure name could always be generated from
the unique (canonized) SMILES. That way such mix-ups would be avoided in the future. I am not sure whether this is feasible.
Cheers
Tobias
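The check described in this post (distinct unique SMILES that end up with one and the same generated name) can be sketched in a few lines of Python. This is only an illustration, not part of cxcalc or Marvin: the function name find_name_collisions is hypothetical, the records are the two SMILES/name pairs quoted above, and the sketch assumes all SMILES come from the same canonizer, so that different strings really mean different structures.

```python
from collections import defaultdict

# (unique SMILES, generated name) pairs copied from the cxcalc output above.
NAME = ("(1R,2R,4R,5R)-6-{[(1S,2R,3R,4S,5R,6S)-2,3,4,5,6-"
        "pentahydroxycyclohexyl]oxy}cyclohexane-1,2,3,4,5-pentol")
records = [
    ("O[C@H]1[C@@H](O)[C@@H](O)[C@@H](O[C@H]2[C@@H](O)[C@H](O)[C@H](O)"
     "[C@@H](O)[C@H]2O)[C@H](O)[C@@H]1O", NAME),
    ("O[C@H]1[C@@H](O)[C@@H](O)[C@H](O[C@H]2[C@@H](O)[C@H](O)[C@H](O)"
     "[C@@H](O)[C@H]2O)[C@H](O)[C@@H]1O", NAME),
]


def find_name_collisions(pairs):
    """Return {name: sorted distinct SMILES} for every name shared by
    more than one distinct SMILES string (hypothetical helper)."""
    by_name = defaultdict(set)
    for smiles, name in pairs:
        by_name[name].add(smiles)
    return {name: sorted(group) for name, group in by_name.items()
            if len(group) > 1}


collisions = find_name_collisions(records)
for name, group in collisions.items():
    print(f"{len(group)} different SMILES share the name {name}")
    for smiles in group:
        print("   ", smiles)
```

Run over a whole file of SMILES and names, the same grouping would list every set of stereoisomers that the name generator fails to tell apart.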
ChemAxon e7b9408ca1
07-11-2008 15:00:10
Dear Tobias,
You are absolutely right: the molecules are indeed different and should have different names. The names coincide because Marvin in general does not detect those topological differences as chiralities 'r' and 's' (the way it does for the other atoms with 'R' and 'S'). Once it does, which is currently planned for release 5.2, these chiralities will also be included in the generated name, which will solve in particular the issue you reported.
Best regards,
Daniel
User 677b9c22ff
17-11-2008 20:28:19
Hi,
thanks Daniel. It must be quite complicated, given the mess
with names one can find in a plethora of publications. Therefore
InChI might be a good solution.
Tobias
ChemAxon e7b9408ca1
18-11-2008 11:13:50
Yes, InChI (or SMILES) are easier ways to generate unique strings identifying structures. But names can be better at conveying a human-readable sense of the nature of the structure. In this case, as soon as Marvin supports (r) and (s) stereo information, the name will be unique as well.
ChemAxon e7b9408ca1
18-10-2013 12:15:18
Tobias,
This comes late, but I'm happy to report that since version 5.11 our stereochemistry engine fully supports these cases. The generated names are now:
(1R,2R,3S,4R,5R,6S)-6-{[(1S,2R,3R,4S,5R,6S)-2,3,4,5,6-pentahydroxycyclohexyl]oxy}cyclohexane-1,2,3,4,5-pentol
(1R,2R,3R,4R,5R,6R)-6-{[(1S,2R,3R,4S,5R,6S)-2,3,4,5,6-pentahydroxycyclohexyl]oxy}cyclohexane-1,2,3,4,5-pentol
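To see at a glance what changed, the stereodescriptor prefixes of the names can be compared locant by locant. The following Python sketch is only an illustration and assumes that the first parenthesised group of a name holds the descriptors of the parent ring; the helper leading_descriptors is hypothetical and is not a general IUPAC name parser.

```python
import re

SUFFIX = ("-6-{[(1S,2R,3R,4S,5R,6S)-2,3,4,5,6-pentahydroxycyclohexyl]oxy}"
          "cyclohexane-1,2,3,4,5-pentol")
old_name = "(1R,2R,4R,5R)" + SUFFIX           # shared name reported earlier
new_name_a = "(1R,2R,3S,4R,5R,6S)" + SUFFIX   # names reported for 5.11
new_name_b = "(1R,2R,3R,4R,5R,6R)" + SUFFIX


def leading_descriptors(name):
    """Map locant -> descriptor from the first parenthesised prefix
    of a name (hypothetical helper, assumes the '(1R,2S,...)' layout)."""
    prefix = re.match(r"\(([^)]*)\)", name)
    if prefix is None:
        return {}
    pairs = re.findall(r"(\d+)([RSrs])", prefix.group(1))
    return {int(locant): descriptor for locant, descriptor in pairs}


old = leading_descriptors(old_name)
a = leading_descriptors(new_name_a)
b = leading_descriptors(new_name_b)

# Locants that had no descriptor in the old, ambiguous name.
added = sorted(set(a) - set(old))
# Locants whose descriptors now distinguish the two isomers.
differing = sorted(loc for loc in set(a) | set(b) if a.get(loc) != b.get(loc))

print("descriptors added at locants:", added)
print("isomers now differ at locants:", differing)
```

For these names, the locants that gained a descriptor are the same ones at which the two isomers now differ.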
User 677b9c22ff
08-11-2013 05:25:05
Hi,
Rome wasn't built in a day. :-)
Thanks and Cheers
Tobias