Standardization of symmetrical double bonds

User c0e481a82c

06-06-2007 08:08:56

Hi,





I've noticed a couple of things about the standardizer and wondered if there was anything I could do about this.





Firstly, I have an SD file with 2-methylbut-2-ene (symmetrical at one side of the double bond, so cis and trans do not apply). The double bond is not drawn as a crossed or "double either" bond, and it does not have a stereosearch flag set on it either. My understanding of the MDL world is that a double bond is treated as "either" by default unless the stereosearch flag is set on it. That's what we used to teach on the ISIS Base training courses when I worked there. So why does the SMILES string I get from it have directions on it? If I look at the atom section of the CTAB, I notice that the 5th column (chirality) has a zero in it (meaning not stereo) until I put the stereosearch flag on (whereupon it becomes a one, meaning stereo). Shouldn't this mean that the SMILES string looks like CCC=C(C)C rather than CC\C=C(/C)C? Or is it that, because the molecule is assumed to be for registration, the geometry of the double bond is taken from the coordinates? After all, crossed bonds, double either bonds, and stereoflagged double bonds are all query features and therefore not for registration.
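For reference while reading the question above, here is a minimal sketch of how the stereo field of a double bond can be read from a V2000 bond block. It assumes the fixed-width layout of the MDL CTfile format, where (as far as I know) a value of 0 on a double bond means cis/trans is derived from the coordinates and 3 means "cis or trans (either)". The function name and the example molblock are hypothetical, and this is not a description of how JChem reads the file internally.

```python
# Assumed meanings of the bond stereo field for double bonds (V2000 CTfile).
STEREO_MEANING = {
    0: "cis/trans taken from coordinates",
    3: "cis or trans (either)",
}


def double_bond_stereo_fields(molblock):
    """Return (atom1, atom2, stereo) for each double bond in a V2000 molblock."""
    lines = molblock.splitlines()
    counts = lines[3]              # counts line follows the three header lines
    n_atoms = int(counts[0:3])     # columns 1-3: number of atoms
    n_bonds = int(counts[3:6])     # columns 4-6: number of bonds
    first_bond = 4 + n_atoms       # bond block starts right after the atom block
    result = []
    for line in lines[first_bond:first_bond + n_bonds]:
        atom1, atom2 = int(line[0:3]), int(line[3:6])
        bond_type, stereo = int(line[6:9]), int(line[9:12])
        if bond_type == 2:         # 2 = double bond
            result.append((atom1, atom2, stereo))
    return result


# Hypothetical molblock for 2-methylbut-2-ene, drawn without any stereo marks.
EXAMPLE = "\n".join([
    "",
    "  hypothetical example",
    "",
    "  5  4  0  0  0  0  0  0  0  0999 V2000",
    "   -1.4289    0.4125    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0",
    "   -0.7145    0.0000    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0",
    "    0.0000    0.4125    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0",
    "    0.7145    0.0000    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0",
    "    0.0000    1.2375    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0",
    "  1  2  1  0  0  0  0",
    "  2  3  2  0  0  0  0",
    "  3  4  1  0  0  0  0",
    "  3  5  1  0  0  0  0",
    "M  END",
])

for atom1, atom2, stereo in double_bond_stereo_fields(EXAMPLE):
    print(atom1, atom2, STEREO_MEANING.get(stereo, "other"))
```

Under these assumptions, a plain double bond with stereo field 0 would be read as having a definite geometry from the drawing, which may be why direction marks show up in the non-unique SMILES.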





The next question concerns the q option of the outFormat of the jcf_Standardize function (e.g. "... outFormat:smiles:q"). I thought this should remove the double bond stereochemistry where one end of the double bond has two equivalent substituents, but it doesn't appear to work for me in JChem 3.2.5. Also, is it possible to combine these options? I'd like to get the unique SMILES with the stereo double bond information removed whenever the double bond has 2 equivalent substituents on one side; something like "... outFormat:smiles:qu" or "... outFormat:smiles:uq". Is that possible?





Thanks for your help.





Regards,





Phil.

ChemAxon aa7c50abf8

06-06-2007 09:04:39

Hi Phil,





Just a blind shot at your problem with the q modifier of the outFormat option:





Could it be related to the problem with the outFormat option you reported earlier (http://www.chemaxon.com/forum/ftopic2842.html)? As you may remember, due to a bug in JChem Cartridge, the outFormat option is generally ignored unless you use the workaround described in that earlier thread.





(The first part of your post will be addressed soon by one of my colleagues.)





Thanks


Peter

User c0e481a82c

06-06-2007 09:07:15

I guess it could well be. I did add the "cleaningTemplate:select ''null'' from dual" to the jcf_Standardize call, but it may still be related.





Regards,





Phil.

User c0e481a82c

06-06-2007 09:24:55

It does appear that this is related to the other bug. It's rather sensitive to how you get the unique SMILES to generate. I double-checked my code and found that I didn't have that empty cleaningTemplate argument, but had got the unique SMILES generation to work the other way I mentioned in the previous case. As it turns out, with that empty argument in there, it does remove the cis/trans marks, and it also appears that I can combine the options using smiles:uq or smiles:qu. Obviously, I'll need to test that against a few more compounds before I'm certain that I'm getting the real unique SMILES, but at least I can remove those extra double bond direction marks.





Thanks for your help.





Regards,





Phil.

ChemAxon 25dcd765a3

06-06-2007 19:01:15

Hi Phil,





I tried to reproduce the situation you mentioned in the simplest case using molconvert, but I get different results:





scripts/molconvert smiles test.sdf


C\C(C)=C(\C)C


scripts/molconvert smiles test1.sdf


C\C(C)=C(\C)C








scripts/molconvert smiles:u test.sdf


CC(C)=C(C)C


scripts/molconvert smiles:u test1.sdf


CC(C)=C(C)C





So if you generate plain SMILES, the atom equivalences are not checked.

If you generate unique SMILES, the atom equivalences are checked.

If you don't want unique SMILES but you still want the atom equivalences to be checked, please use the 'q' option (see http://www.chemaxon.com/marvin/doc/user/smiles-doc.html#options):





scripts/molconvert smiles:q test.sdf


CC(C)=C(C)C


scripts/molconvert smiles:q test1.sdf


CC(C)=C(C)C
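The equivalence check described above can be pictured with a toy sketch. This is not ChemAxon's implementation: the function names are hypothetical, substituents are compared as plain text labels (a real toolkit would compare graph symmetry classes), and the direction marks are stripped from the whole string, which is only valid for a molecule with a single double bond.

```python
def has_cis_trans(left_substituents, right_substituents):
    """Return True if both ends of a double bond carry two distinct substituents.

    Each argument is a pair of substituent labels for one end of the bond.
    Comparing labels as strings is a simplification made for this sketch.
    """
    def distinct(pair):
        first, second = pair
        return first != second

    return distinct(left_substituents) and distinct(right_substituents)


def strip_bond_directions(smiles):
    """Remove the '/' and '\\' direction marks from a SMILES string."""
    return smiles.replace("/", "").replace("\\", "")


def smiles_with_equivalence_check(smiles, left_substituents, right_substituents):
    """Drop direction marks when cis/trans is meaningless for the double bond."""
    if has_cis_trans(left_substituents, right_substituents):
        return smiles
    return strip_bond_directions(smiles)


# 2,3-dimethylbut-2-ene: both ends carry two methyl groups, so the marks go.
print(smiles_with_equivalence_check("C\\C(C)=C(\\C)C", ("C", "C"), ("C", "C")))

# But-2-ene: each end carries a methyl and a hydrogen, so the marks stay.
print(smiles_with_equivalence_check("C/C=C/C", ("C", "[H]"), ("C", "[H]")))
```

The first call mirrors the test.sdf result shown above: the directional input becomes CC(C)=C(C)C once equivalent substituents are detected.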








I hope this helps,








Andras

User c0e481a82c

07-06-2007 07:26:57

Hi Andras,





I think there are a couple of things here. First, thanks for the update on the u/q switches. It's good to know that I just need to use u and that's it. Secondly, if you try these switches with jcf_Standardize, you'll find that the smiles:u option doesn't work unless you use it in a specific way. By that I mean it will either not run at all, or not generate the unique SMILES. This relates to a previous case I submitted (http://www.chemaxon.com/forum/ftopic2842.html). What I was in fact seeing was that the u switch wasn't working because of the way the rest of the jcf_Standardize argument was built: if there's no cleaningTemplate argument, the smiles:u option isn't recognised correctly and either doesn't run at all, or runs but doesn't seem to remove the stereo marks from double bonds with equivalent substituents.





Thanks for your help. I think I have my answer: use smiles:u only (I don't need the q option as well), and use it with a cleaningTemplate argument, otherwise it won't work correctly.





Regards,





Phil.

ChemAxon aa7c50abf8

27-06-2007 14:51:03

JChem 3.2.7 has been released with the fix for the problem "jc(f)_standardize doesn't work when options are specified in addition to 'config'".