User c0e481a82c
06-06-2007 08:08:56
Hi,
I've noticed a couple of things about the standardizer and wondered if there was anything I could do about this.
Firstly, I have an SD file containing 2-methylbut-2-ene (symmetrical at one end of the double bond, so cis and trans do not apply). The double bond is not drawn as a crossed or "double either" bond, and it does not have the stereosearch flag set on it either. My understanding of the MDL world is that a double bond is treated as "either" by default unless the stereosearch flag is set on that double bond. That's what we used to teach on the ISIS Base training courses when I worked there. So I'm wondering why the SMILES string I get from it has direction marks on it. If I look at the atom section of the CTAB, I notice that the 5th column (chirality) has a zero in it (meaning not stereo) until I put the stereosearch flag on, whereupon it becomes a one (meaning stereo). Shouldn't this mean that the SMILES string looks like CCC=C(C)C rather than CC\C=C(/C)C? Or is it that, because the molecule is assumed to be for registration, the geometry of the double bond is taken from the coordinates? After all, crossed bonds, double either bonds, and stereoflagged double bonds are all query features and therefore not for registration.
The next question concerns the q option of the outFormat of the jcf_Standardize function (e.g. "... outFormat:smiles:q"). I thought this option should remove the stereochemistry from a double bond that has two equivalent substituents on one side, but it doesn't appear to work for me in JChem 3.2.5. Also, is it possible to combine these options? I'd like to get the unique SMILES with the double bond stereo information removed whenever the double bond has 2 equivalent substituents on one side; something like "... outFormat:smiles:qu" or "... outFormat:smiles:uq". Is that possible?
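For illustration only, the behaviour asked for here can be sketched in plain Python. This is not JChem or Standardizer code: the function names, the hand-built atom and bond lists, and the simple branch comparison are all assumptions made for the example. It handles only acyclic carbon skeletons with implicit hydrogens, and it takes the connectivity from the SMILES quoted above.

```python
from collections import defaultdict

# Hypothetical hand-built graph for the SMILES CCC=C(C)C quoted above.
# Atoms are indexed left to right; bonds are (atom, atom, order).
ATOMS = ["C", "C", "C", "C", "C", "C"]
BONDS = [(0, 1, 1), (1, 2, 1), (2, 3, 2), (3, 4, 1), (3, 5, 1)]


def build_adjacency(bonds):
    """Map each atom index to the list of its neighbours."""
    adj = defaultdict(list)
    for a, b, _order in bonds:
        adj[a].append(b)
        adj[b].append(a)
    return adj


def branch_signature(atoms, adj, atom, parent):
    """Order-independent string for the branch rooted at `atom`.

    Simplification: works for acyclic molecules only and ignores bond
    orders and hydrogens inside the branch.
    """
    children = sorted(
        branch_signature(atoms, adj, n, atom) for n in adj[atom] if n != parent
    )
    return atoms[atom] + "(" + ",".join(children) + ")"


def end_has_equivalent_substituents(atoms, adj, end, other_end):
    """True if the two substituents on `end` look identical."""
    sigs = [
        branch_signature(atoms, adj, n, end) for n in adj[end] if n != other_end
    ]
    # Assumption: an sp2 carbon has two substituents besides the other end
    # of the double bond, so missing ones are implicit hydrogens.
    sigs += ["H()"] * (2 - len(sigs))
    return len(sigs) == 2 and sigs[0] == sigs[1]


def double_bond_is_stereogenic(atoms, bonds, a, b):
    """A double bond can be cis/trans only if neither end is symmetrical."""
    adj = build_adjacency(bonds)
    return not (
        end_has_equivalent_substituents(atoms, adj, a, b)
        or end_has_equivalent_substituents(atoms, adj, b, a)
    )


def strip_direction_marks(smiles):
    """Crude removal of all '/' and '\\' bond direction marks."""
    return smiles.replace("/", "").replace("\\", "")


smiles = "CC\\C=C(/C)C"
if not double_bond_is_stereogenic(ATOMS, BONDS, 2, 3):
    smiles = strip_direction_marks(smiles)
print(smiles)
```

A real toolkit would decide equivalence from its own canonical atom ranking rather than from a string comparison like this, and it would remove direction marks per double bond rather than from the whole string.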
Thanks for your help.
Regards,
Phil.