Technical Support Forum
Leveraging chemaxon plugins to assess structure definition
Dennis

Joined: 24 Feb 2009
Posts: 54

Posted: Tue Apr 20, 2010 3:53 pm
Post subject: Leveraging chemaxon plugins to assess structure definition

Hello

We currently use ChemAxon JChem tools for most of the processing in our compound registration system and would like to do 100% of the processing with ChemAxon (currently we bounce out to Pipeline Pilot for some additional processing). One of the items we bounce out is what we call our "ambiguity detector." This process takes a molecule, determines whether it has any undefined stereocenters, and flags the molecule if so. In our registration system, compounds receive an "A" if any stereochemistry is undefined and a "K" otherwise.

In an attempt to replicate this behavior in ChemAxon, I have written some code (below) that determines the number of asymmetric atoms in the molecule and then checks the chirality of each carbon (R, S, neither, or UndefinedParity). If there are no asymmetric atoms, the compound is given a "K". If the count of R+S equals the asymmetric atom count OR there are no carbons with UndefinedParity (a chirality flag of 3), the compound is marked "K". Otherwise it is an "A".

This process seems to work pretty well except when the molecule has an asymmetric nitrogen AND an adamantane group. Here the adamantane gets flagged as having a few carbons with undefinedparity (3), and the asymmetric atom count does not match the R+S count. As you can see, this logic is a bit clunky (and likely not the best way to proceed). I was wondering

1) whether there is a way to ask if an individual atom is asymmetric, so that we can get a count of carbons only

2) whether you are developing anything that looks at a structure and determines that it does not have defined stereochemistry


One thing I did not mention is that we would like to look at double bond stereochemistry in compounds as well and see if that is defined.  I believe I have a way of doing that which I did not describe here but I may wish to add this issue to this conversation. 

I have attached an sdf with some examples of the kinds of compounds I have been testing against and the results of the code below.  Thank you for any assistance you can provide.

 

import java.io.IOException;
import java.util.List;

import chemaxon.calculations.TopologyAnalyser;
import chemaxon.formats.MolImporter;
import chemaxon.struc.MolAtom;
import chemaxon.struc.Molecule;

// AssertUtil and StringUtils are not ChemAxon classes; their imports are omitted here.
public static boolean isAmbiguous(String smiles, String letter, List columns) throws IOException
{
    AssertUtil.a(StringUtils.isNotBlank(smiles), "smiles can't be blank");

    // convert SMILES to a molecule
    Molecule molecule = MolImporter.importMol(smiles);
    molecule.clean(2, null); // clean 2D

    // use the TopologyAnalyser plugin to calculate a couple of properties of the molecule
    TopologyAnalyser topologyPlugin = new TopologyAnalyser();
    topologyPlugin.setMolecule(molecule);
    int asymmetricAtomCount = topologyPlugin.asymmetricAtomCount();
    int chiralCenterCount = topologyPlugin.chiralCenterCount(); // calculated but not used below

    // call method to determine carbon ambiguity
    return isTetrahedralAmbiguous(molecule, asymmetricAtomCount);
}

private static boolean isTetrahedralAmbiguous(Molecule molecule, int asymmetricAtomCount)
{
    // no asymmetric atoms: nothing can be undefined
    if (asymmetricAtomCount == 0)
    {
        return false;
    }

    boolean undefinedParity = false;
    int rAndSCount = 0;

    for (int i = 0; i < molecule.getAtomCount(); i++)
    {
        MolAtom molAtom = molecule.getAtom(i);
        if ("C".equals(molAtom.getSymbol()))
        {
            int chirality = molecule.getChirality(i);
            if (chirality == 16 || chirality == 8) // R or S
            {
                ++rAndSCount;
            }
            else if (chirality == 3) // undefined parity
            {
                undefinedParity = true;
            }
        }
    }

    // ambiguous only if some carbon has undefined parity and the R+S count
    // does not account for every asymmetric atom
    return undefinedParity && rAndSCount != asymmetricAtomCount;
}
       




 Filename: examples_of_ambiguous_compounds.sdf    Filesize: 10.67 KB    Downloaded: 229 Time(s)
 Description:  
Volfi
ChemAxon personnel
Joined: 07 Jun 2004
Posts: 996

Posted: Wed Apr 21, 2010 11:59 am

Hi,

Let's discuss the examples you have attached.

1- The molecule should get "K", as all chiral atoms have a specified chirality value.

2- The molecule should get "A", as the carbon with atomic index 5 doesn't have a specified chirality.

3- The molecule has no chiral center; however, it has cis/trans stereoisomerism in the ring, so the molecule itself could be chiral if the cyclohexane had wedge bonds.

4- I guess this is the case you mentioned: "an asymmetric nitrogen AND an adamantane group".

Adamantane is a rigid ring system, so the chirality of substituted adamantane systems can be deduced. This algorithm is already in our plans.

"Here the adamantane gets flagged as having a few carbons with undefinedparity (3)"

Yes, you are right: atoms 4, 6, 8, and 10 get undefinedparity, which is a bug. These atoms should get a chirality value of 0. (This bug has no connection to the presence of the N atom.)

The molecule should get the letter "K".

5- Atom indexes 4 and 10 should get undefinedparity; the other atoms should get 0. The molecule should get "A".

6- It seems that it should get the letter "K".

Is this what you would expect?

Andras

Dennis

Joined: 24 Feb 2009
Posts: 54

Posted: Wed Apr 21, 2010 1:54 pm

Andras

Thank you very much for your reply. 

Your analysis is pretty much 100% in line with what I was expecting. I am happy to hear that you are aware of the bug with respect to structure 4. I believe that fix will solve my issue. Is there any idea when it will be released?

In truth after posting I went back and considered the possibility of not just evaluating Carbon atoms for chirality and may open it up to N, S, P as well.  It seems as though the functionality exposed handles this quite well already.
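A minimal sketch of that idea, written without any ChemAxon dependency so it can be run on its own: it takes one chirality flag per atom (for example, the values returned by molecule.getChirality(i) in the code earlier in this thread) and flags the structure if any atom, regardless of element, has undefined parity. The class name, method names, and the constant are hypothetical; the flag value 3 is the one quoted in the first post, and the sketch assumes the adamantane issue discussed above is fixed so that non-stereogenic atoms report 0.

```java
import java.util.Arrays;

/** Hypothetical helper: derives the registration letter from per-atom chirality flags. */
class AmbiguityFlag {

    // Assumption: value taken from the code posted in this thread, not from the API docs.
    static final int UNDEFINED_PARITY = 3;

    /** Returns true if any atom, of any element, reports an undefined parity. */
    static boolean isTetrahedralAmbiguous(int[] chiralityFlags) {
        return Arrays.stream(chiralityFlags).anyMatch(flag -> flag == UNDEFINED_PARITY);
    }

    /** "A" for ambiguous structures, "K" otherwise, following the convention in the first post. */
    static String registrationLetter(int[] chiralityFlags) {
        return isTetrahedralAmbiguous(chiralityFlags) ? "A" : "K";
    }

    public static void main(String[] args) {
        System.out.println(registrationLetter(new int[] {0, 8, 16, 0})); // prints K
        System.out.println(registrationLetter(new int[] {0, 8, 3, 0}));  // prints A
    }
}
```

Because the element symbol is never consulted, N, S, and P centers are treated the same way as carbon; whether that is appropriate depends on how reliably the toolkit assigns flags to those atoms.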

I have one other question I am hoping you can help me with.  I would like to access the doubleBondStereoisomerCount through the API, but it appears as though the Stereochemical plugin does not expose this functionality. Is there something I am missing? Are there any plans to expose this more advanced functionality?

Thank you again, as always you guys are most helpful

Dennis

 

 

Zsolt
ChemAxon personnel
Joined: 11 Jan 2006
Posts: 1163

Posted: Thu Apr 22, 2010 10:43 am

dmoccia wrote:

I have one other question I am hoping you can help me with.  I would like to access the doubleBondStereoisomerCount through the API, but it appears as though the Stereochemical plugin does not expose this functionality. Is there something I am missing? Are there any plans to expose this more advanced functionality?

To get the doubleBondStereoisomerCount through the API set StereoisomerPlugin.setStereoisomerismType(int) to StereoisomerPlugin.DOUBLE_BOND and then call the StereoisomerPlugin.getStereoisomerCount() method.

Zsolt

Volfi
ChemAxon personnel
Joined: 07 Jun 2004
Posts: 996

Posted: Thu Apr 22, 2010 11:43 am

Hi Dennis,

I'm currently revising and rewriting the stereochemical recognition in Marvin.

It will (hopefully) be ready in 5.4.

Chirality recognition is supported for other atoms too, not just carbon.

For example, nitrogen is normally not a stereocenter because it is flexible enough to invert, but if the atom itself is in a ring and all its ligands are also in rings smaller than size 12, this flexibility vanishes.
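One possible reading of that nitrogen rule, as a standalone sketch. This is not Marvin's implementation: the class, method, and parameter names are hypothetical, ring sizes are assumed to be supplied by the caller (0 meaning "not in a ring"), and the size limit is applied only to the ligands' rings, which is one interpretation of the sentence above. It also does not check that the ligands are actually different from each other.

```java
/** Hypothetical sketch of the ring rule for nitrogen described above. */
class NitrogenStereoRule {

    // "smaller than size 12" means rings of at most 11 atoms
    static final int MAX_RIGID_RING_SIZE = 11;

    /**
     * @param nitrogenRingSize size of the smallest ring containing the nitrogen, 0 if acyclic
     * @param ligandRingSizes  size of the smallest ring containing each ligand, 0 if acyclic
     * @return true if inversion is assumed to be blocked, so the nitrogen may be a stereocenter
     */
    static boolean canBeStereocenter(int nitrogenRingSize, int[] ligandRingSizes) {
        if (nitrogenRingSize == 0) {
            return false; // an acyclic nitrogen is free to invert
        }
        for (int size : ligandRingSizes) {
            if (size == 0 || size > MAX_RIGID_RING_SIZE) {
                return false; // an acyclic or large-ring ligand leaves enough flexibility
            }
        }
        return true;
    }
}
```

For example, canBeStereocenter(6, new int[] {6, 6, 6}) is true for a bridgehead-like nitrogen, while a single acyclic ligand, as in canBeStereocenter(6, new int[] {6, 0, 6}), makes it false.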

All the best

Andras

Dennis

Joined: 24 Feb 2009
Posts: 54

Posted: Thu Apr 22, 2010 6:24 pm

Andras & Zsolt


Thank you for the responses.  I will look forward to the patch.


Following your advice on the doubleBondStereoisomerCount, I was able to get at the counts. However, I have noticed some discrepancies between IJC and calling stereoDoubleBondCount() from the topology plugin.

In IJC I see

[H]\C(c1c(-c2ccccc2)n(C)c2ccccc12)=C1\C(=O)Oc2ccccc2C1=O   stereoDoubleBondCount() = 1

[H]C(c1c(-c2ccccc2)n(C)c2ccccc12)=C1C(=O)Oc2ccccc2C1=O   stereoDoubleBondCount() = 0

but when I call the count in the following code

CODE START

TopologyAnalyser topologyPlugin = new TopologyAnalyser();
topologyPlugin.setMolecule(molecule);           

int stereoDoubleBondCount = topologyPlugin.stereoDoubleBondCount();
log.info("StereoDoubleBondCount = " + stereoDoubleBondCount);


CODE END

I receive a  stereoDoubleBondCount() = 0 for both instances.  Any ideas why this might be the case?

Also, the rules behind stereoDoubleBondCount() seem a bit odd, and I was hoping you could elaborate on them. I am not sure why the first example has a count of 0 when 2 doublebondStereoisomers are calculated in IJC.

In IJC

Cn1c2ccc3ccccc3c2s\c1=N\C(=O)c1ccc(cc1)S(=O)(=O)N1CCCCC1

doublebondStereoisomerCount() = 2

stereoDoubleBondCount() = 0

 

C\C(=N\C(=O)c1ccc(cc1)S(=O)(=O)N1CCCCC1)c1ccc2ccccc2c1

doublebondStereoisomerCount() = 2

stereoDoubleBondCount() = 1

 

Using the topology plugin and the code below for the doublebond stereoisomer count, I get:

Cn1c2ccc3ccccc3c2s\c1=N\C(=O)c1ccc(cc1)S(=O)(=O)N1CCCCC1

doublebondStereoisomerCount() = 1

stereoDoubleBondCount() = 0

C\C(=N\C(=O)c1ccc(cc1)S(=O)(=O)N1CCCCC1)c1ccc2ccccc2c1

doublebondStereoisomerCount() = 1

stereoDoubleBondCount() = 1

 

CODE START

StereoisomerPlugin isomerPlugin = new StereoisomerPlugin();
//set input molecule
isomerPlugin.setMolecule(molecule);

//set plugin parameters
isomerPlugin.setStereoisomerismType(2);
isomerPlugin.setCheck3DStereo(true);
isomerPlugin.setIn3D(true);
log.info("here");
//run plugin
isomerPlugin.run();
//get count
int stereoisomerCount = isomerPlugin.getStereoisomerCount();

CODE END

I can provide more examples if you wish,

Thank you again for all your help

Dennis

 

Zsolt
ChemAxon personnel
Joined: 11 Jan 2006
Posts: 1163

Posted: Fri Apr 23, 2010 5:39 pm

dmoccia wrote:


Following your advice on the doubleBondStereoisomerCount, I was able to get at the counts. However, I have noticed some discrepancies between IJC and calling stereoDoubleBondCount() from the topology plugin.

In IJC I see

[H]\C(c1c(-c2ccccc2)n(C)c2ccccc12)=C1\C(=O)Oc2ccccc2C1=O   stereoDoubleBondCount() = 1

[H]C(c1c(-c2ccccc2)n(C)c2ccccc12)=C1C(=O)Oc2ccccc2C1=O   stereoDoubleBondCount() = 0

but when I call the count in the following code

CODE START

TopologyAnalyser topologyPlugin = new TopologyAnalyser();
topologyPlugin.setMolecule(molecule);           

int stereoDoubleBondCount = topologyPlugin.stereoDoubleBondCount();
log.info("StereoDoubleBondCount = " + stereoDoubleBondCount);


CODE END

I receive a  stereoDoubleBondCount() = 0 for both instances.  Any ideas why this might be the case?

Dennis, I get the same results with the API (Marvin 5.3.2), as in IJC.

My code:

import chemaxon.calculations.TopologyAnalyser;
import chemaxon.formats.MolImporter;

public class TopologyAnalyserTest {

    // the two SMILES from the post: with and without double bond stereo marks
    private static final String[] MOLS = new String[] {
        "[H]\\C(c1c(-c2ccccc2)n(C)c2ccccc12)=C1\\C(=O)Oc2ccccc2C1=O",
        "[H]C(c1c(-c2ccccc2)n(C)c2ccccc12)=C1C(=O)Oc2ccccc2C1=O"
    };

    public static void main(String[] args) throws Exception {
        TopologyAnalyser topologyPlugin = new TopologyAnalyser();
        topologyPlugin.setMolecule(MolImporter.importMol(MOLS[0]));
        System.out.println("1. StereoDoubleBondCount = " + topologyPlugin.stereoDoubleBondCount());
        topologyPlugin.setMolecule(MolImporter.importMol(MOLS[1]));
        System.out.println("2. StereoDoubleBondCount = " + topologyPlugin.stereoDoubleBondCount());
    }
}

The output:

1. StereoDoubleBondCount = 1
2. StereoDoubleBondCount = 0

Please attach your complete Java code so we can examine it.

dmoccia wrote:

Also, the rules behind stereoDoubleBondCount() seem a bit odd, and I was hoping you could elaborate on them. I am not sure why the first example has a count of 0 when 2 doublebondStereoisomers are calculated in IJC.

In IJC

Cn1c2ccc3ccccc3c2s\c1=N\C(=O)c1ccc(cc1)S(=O)(=O)N1CCCCC1

doublebondStereoisomerCount() = 2

stereoDoubleBondCount() = 0

 

C\C(=N\C(=O)c1ccc(cc1)S(=O)(=O)N1CCCCC1)c1ccc2ccccc2c1

doublebondStereoisomerCount() = 2

stereoDoubleBondCount() = 1

All values except the stereoDoubleBondCount() for molecule "Cn1c2ccc3ccccc3c2s\c1=N\C(=O)c1ccc(cc1)S(=O)(=O)N1CCCCC1" seem to be OK. In this case TopologyAnalyser cannot identify the double bond between the N and the aromatic ring carbon as a cis or trans double bond. We are working on the fix.

dmoccia wrote:

Using the topology plugin and the code below for the doublebond stereoisomer count, I get:

Cn1c2ccc3ccccc3c2s\c1=N\C(=O)c1ccc(cc1)S(=O)(=O)N1CCCCC1

doublebondStereoisomerCount() = 1

stereoDoubleBondCount() = 0

C\C(=N\C(=O)c1ccc(cc1)S(=O)(=O)N1CCCCC1)c1ccc2ccccc2c1

doublebondStereoisomerCount() = 1

stereoDoubleBondCount() = 1

 

CODE START

StereoisomerPlugin isomerPlugin = new StereoisomerPlugin();
        //set input molecule
        isomerPlugin.setMolecule(molecule);
        
        //set plugin parameters
        isomerPlugin.setStereoisomerismType(2);
        isomerPlugin.setCheck3DStereo(true);              
      isomerPlugin.setIn3D(true);  
        log.info("here");
        //run plugin
        isomerPlugin.run();
        //get count
         int stereoisomerCount = isomerPlugin.getStereoisomerCount();

CODE END

There is a bug in the StereoisomerPlugin API: doublebondStereoisomerCount() should be 2 in both cases. MarvinSketch, cxcalc, Chemical Terms, and IJC are not affected by this bug. We will fix it.

Thanks for the detailed bug report.
Zsolt

Zsolt
ChemAxon personnel
Joined: 11 Jan 2006
Posts: 1163

Posted: Wed Apr 28, 2010 8:47 am


dmoccia wrote:

Using the topology plugin and the code below for the doublebond stereoisomer count, I get:

Cn1c2ccc3ccccc3c2s\c1=N\C(=O)c1ccc(cc1)S(=O)(=O)N1CCCCC1

doublebondStereoisomerCount() = 1

stereoDoubleBondCount() = 0

C\C(=N\C(=O)c1ccc(cc1)S(=O)(=O)N1CCCCC1)c1ccc2ccccc2c1

doublebondStereoisomerCount() = 1

stereoDoubleBondCount() = 1

 

CODE START

StereoisomerPlugin isomerPlugin = new StereoisomerPlugin();
        //set input molecule
        isomerPlugin.setMolecule(molecule);
        
        //set plugin parameters
        isomerPlugin.setStereoisomerismType(2);
        isomerPlugin.setCheck3DStereo(true);              
      isomerPlugin.setIn3D(true);  
        log.info("here");
        //run plugin
        isomerPlugin.run();
        //get count
         int stereoisomerCount = isomerPlugin.getStereoisomerCount();

CODE END

There is a bug in the StereoisomerPlugin API: doublebondStereoisomerCount() should be 2 in both cases. MarvinSketch, cxcalc, Chemical Terms, and IJC are not affected by this bug. We will fix it.

Thanks for the detailed bug report.
Zsolt

We identified the bug: by default StereoisomerPlugin.setProtectDoubleBondStereo(boolean) is set to true, so double bonds with a specified cis or trans configuration are not allowed to change their stereo configuration. This will be fixed in Marvin 5.3.3, where the default will be false, as specified in the javadoc. Until then, please insert the line

isomerPlugin.setProtectDoubleBondStereo(false);

into your code, and the API will return the same result as IJC. Code:

StereoisomerPlugin isomerPlugin = new StereoisomerPlugin();
//set input molecule
isomerPlugin.setMolecule(molecule);
        
//set plugin parameters
isomerPlugin.setStereoisomerismType(2);
isomerPlugin.setProtectDoubleBondStereo(false);
isomerPlugin.setCheck3DStereo(true);              
isomerPlugin.setIn3D(true);  
log.info("here");
//run plugin
isomerPlugin.run();
//get count
int stereoisomerCount = isomerPlugin.getStereoisomerCount();
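
To see why the flag changes the count from 1 to 2 for the molecules in this thread, a simplified model can help. The sketch below is not the plugin's algorithm: the class and parameter names are hypothetical, and it assumes every stereogenic double bond that is free to change contributes a factor of 2, ignoring symmetry-equivalent isomers.

```java
/** Hypothetical model of how protecting specified double bonds changes the isomer count. */
class DoubleBondIsomerCount {

    /**
     * @param specified        stereogenic double bonds drawn with an explicit cis/trans configuration
     * @param unspecified      stereogenic double bonds drawn without a configuration
     * @param protectSpecified mirrors the role of the setProtectDoubleBondStereo flag
     * @return number of double bond stereoisomers under this simplified model
     */
    static long count(int specified, int unspecified, boolean protectSpecified) {
        // protected bonds keep their drawn configuration and add no new isomers
        int free = unspecified + (protectSpecified ? 0 : specified);
        return 1L << free; // 2^free, ignoring symmetry-equivalent isomers
    }
}
```

Both SMILES strings above carry one specified double bond, so the model gives count(1, 0, true) == 1 with the old default and count(1, 0, false) == 2 once protection is switched off, matching the values reported for the API and for IJC.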

Regards,
Zsolt



Last edited by Zsolt on Wed Apr 28, 2010 3:08 pm; edited 1 time in total
Dennis

Joined: 24 Feb 2009
Posts: 54

Posted: Wed Apr 28, 2010 11:54 am

Zsolt

Thank you for the update. I will make that change. I still owe you my code using the Topology Plugin; I will post that later today.

 

Dennis
