Multiple Source view issue for InChIKeys and SMILES

User 677b9c22ff

23-02-2015 20:01:15

Hi,


With multiple molecules on the canvas, MarvinSketch --> Edit --> View Source correctly shows the names of all molecules, separated by semicolons. However, the InChIKey view in single-molecule mode creates a false hybrid.


Example molecules in MarvinSketch, just drawn, no reactions:


[H]CC([H])C(=[O])[O][H] 
[H][O]CC([H])C([H])=[O]
[H]CC([O][H])C([H])=[O]

Name with single molecule view (correct)


2-hydroxypropanal; 3-hydroxypropanal; propanoic acid


SMILES says:


Cannot convert molecule to 'smiles' format



The InChIKey view incorrectly shows a new hybrid:


InChIKey=UWZBAXALOBOTRL-HIWQHCIONA-N



When "view as multiple molecules" is turned on, it shows the three correct InChIKeys:


InChIKey=XBDQKXXYIPTUBI-UHFFFAOYNA-N
InChIKey=AKXKFZDCRYJKTF-UHFFFAOYNA-N
InChIKey=BSABBBMNWQWLLU-YEQUQEJONA-N


So I believe that, from a logical workflow standpoint, the InChIKey view is wrong, because it creates a hybrid molecule.


The SMILES view could be shown in a similar style to the names, with multiple SMILES separated by semicolons.


The InChIKey view could show the same, separated by semicolons or CR/LF, in multiple view.


 


This is MarvinSketch 6.05


Cheers


Tobias

ChemAxon d26931946c

24-02-2015 13:55:30

Hi Tobias,


 


In our latest version, the SMILES of multiple fragments are separated by a dot: C1CCCCC1.C1=CC=CC=C1


This is part of the SMILES standard.


For names we use a semicolon, since it can be identified unambiguously: names don't contain that character.


We don't know of any such standard separator in the InChI or InChIKey standard, and using a semicolon or newline character during export may break other parsers that expect single-line output for an InChIKey.
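
To illustrate the two separators described above, here is a minimal Python sketch. The helper names split_smiles_fragments and split_names are hypothetical and not part of any ChemAxon API. It assumes plain SMILES in which no ring-closure digits pair up across a dot; a real toolkit should be used for anything beyond simple cases.

```python
from typing import List


def split_smiles_fragments(smiles: str) -> List[str]:
    """Split a dot-separated SMILES string into its fragment strings.

    Assumption: no ring-closure bond spans a dot (e.g. "C1.C1" would
    actually be a single connected molecule and is not handled here).
    """
    return [part for part in smiles.split(".") if part]


def split_names(names: str) -> List[str]:
    """Split a semicolon-separated name list, trimming whitespace."""
    return [name.strip() for name in names.split(";") if name.strip()]


fragments = split_smiles_fragments("C1CCCCC1.C1=CC=CC=C1")
names = split_names("2-hydroxypropanal; 3-hydroxypropanal; propanoic acid")
print(fragments)
print(names)
```

The same split would not work for an InChIKey, because a key has no internal separator to split on.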


 


BRs


Peter

User 677b9c22ff

24-02-2015 18:55:26

Hi,


I think it's a bug. I am concerned about the false InChIKey it creates, not the SMILES.


If the multiple molecule view is not turned on, it should say "cannot create InChIKey" or automatically switch to multiview.


 


It's interesting that this already came up 5 years ago:


https://www.chemaxon.com/forum/viewpost28831.html


 


At a minimum there could be a warning: "cannot create InChIKey, switch to multiple view".


It is implemented for SMILES, so why not for InChIKey?
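
As a rough sketch of the suggested behavior (not how MarvinSketch works internally), a guard like the following could warn about multi-fragment input and fall back to one key per fragment. The function inchikeys_with_warning and the callable to_inchikey are hypothetical; the stand-in key function below only exists to make the example runnable and does not compute real InChIKeys.

```python
import warnings
from typing import Callable, List


def inchikeys_with_warning(smiles: str,
                           to_inchikey: Callable[[str], str]) -> List[str]:
    """Return one key per dot-separated fragment, warning on mixtures.

    `to_inchikey` is a caller-supplied (hypothetical) function that maps
    a single-fragment SMILES to its InChIKey.
    """
    fragments = [part for part in smiles.split(".") if part]
    if len(fragments) > 1:
        warnings.warn(
            "Input has %d fragments; generating one InChIKey per fragment "
            "instead of a single mixture key." % len(fragments)
        )
    return [to_inchikey(fragment) for fragment in fragments]


def fake_key(fragment: str) -> str:
    # Placeholder only: NOT a real InChIKey computation.
    return "KEY(" + fragment + ")"


keys = inchikeys_with_warning("CC.CCC.CCCC", fake_key)
print(keys)
```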


 


Funnily, PubChem has a mixture entry for methane, ethane, propane:


https://pubchem.ncbi.nlm.nih.gov/summary/summary.cgi?sid=224623552


SFROHDSJNZWBTF-UHFFFAOYSA-N


 


The InChI code can represent mixtures (components separated by semicolons), and InChIKeys for mixtures exist too (no idea if that is intended):


http://www.inchi-trust.org/technical-faq/#5.7


 


So maybe the warning should actually ask: do you want to represent the mixture, or do you want the multiple compound view? From a usability standpoint, given the online option to search InChIKeys in Chemicalize and ChemSpider, it would be prudent to do so.


 


I only stumbled across this after I was not able to find very common substances, and the InChIKey was the same for all of the compounds.


Cheers


Tobias




 

ChemAxon d26931946c

23-03-2015 08:41:58

Hi Tobias,



I think we have the correct behavior in our latest release.

For the molecule: CC.CCC.CCCC

We generate the InChI code: InChI=1S/C4H10.C3H8.C2H6/c1-3-4-2;1-3-2;1-2/h3-4H2,1-2H3;3H2,1-2H3;1-2H3

This is correct, as the fragments are represented in the code. The InChIKey generated for this molecule is: InChIKey=SFROHDSJNZWBTF-UHFFFAOYSA-N

And I think this is correct too, as the InChIKey documentation says the InChIKey is just a hash calculated from the corresponding InChI.
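
To show how the fragments are represented in the mixture InChI above, here is a small Python sketch that pulls the per-component InChI strings back out. The function split_mixture_inchi is hypothetical and deliberately naive: it assumes every layer after the formula lists one semicolon-separated entry per component, and it rejects InChIs that use multipliers for repeated components (such as "2C2H6" or "2*1-2").

```python
from typing import List


def split_mixture_inchi(inchi: str) -> List[str]:
    """Naively split a simple multi-component InChI into component InChIs."""
    prefix, formula, *layers = inchi.split("/")
    formulas = formula.split(".")
    # Repeated components are abbreviated with multipliers; not handled here.
    if any(f[:1].isdigit() for f in formulas) or any("*" in l for l in layers):
        raise ValueError("multiplier notation is not supported by this sketch")

    components = []
    for index, component_formula in enumerate(formulas):
        parts = [prefix, component_formula]
        for layer in layers:
            tag, body = layer[0], layer[1:]
            pieces = body.split(";")
            if len(pieces) != len(formulas):
                raise ValueError("layer /%s does not match component count" % tag)
            if pieces[index]:
                parts.append(tag + pieces[index])
        components.append("/".join(parts))
    return components


mixture = ("InChI=1S/C4H10.C3H8.C2H6/c1-3-4-2;1-3-2;1-2"
           "/h3-4H2,1-2H3;3H2,1-2H3;1-2H3")
components = split_mixture_inchi(mixture)
for component in components:
    print(component)
```

Each component string would hash to its own InChIKey, whereas the full mixture string hashes to the single key quoted above.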

Best regards,

Peter

User 677b9c22ff

25-03-2015 18:37:12

OK.


Thanks.


Tobias