How to find the right decomposition from RGroupDecomposition

User b9aa8da57b

11-12-2016 20:00:50

Hi, In my R-group decomposition use case, the query has a six-membered ring core, and the R-groups (R1-R6) are numbered counterclockwise. The RGroupDecomposition class generates hits with the R-groups in counterclockwise order. However, among the 7 decompositions that rgd identifies, I am only interested in the one whose target is not rotated relative to my original target molecule. My original target molecule is in Mol format with 2D coordinates, and chemists always draw it in the same orientation to track the series they are interested in. Is there a way to find the decomposition whose target is not rotated relative to my original target molecule? A constant shift in X and Y is OK. I have looked at the parameters that can be set on SearchOptions, but it is not clear whether any of them can be used to pick out the decomposition I want.


Thanks


Dong

ChemAxon 58554172c4

12-12-2016 14:48:38

Hi Dong, If you don't want to rotate your hits please try the --keep-coordinates parameter of R-group decomposition.


https://docs.chemaxon.com/display/docs/R-group+Decomposition+User's+Guide#R-groupDecompositionUser'sGuide-usage


If this does not solve your issue, or if I have misunderstood your description, please send me some non-confidential examples.


Best regards,


Árpád

User b9aa8da57b

13-12-2016 17:05:30

Thanks for the answer. However, this option is for the command-line R-group decomposition tool. I assume the RGroupDecomposition class has a setting similar to --keep-coordinates, but I cannot find such a parameter in the RGroupDecomposition API. We use this API to code our decomposition process instead of the command-line utility for the sake of flexibility. Can you or someone else illustrate how --keep-coordinates is actually implemented in code, or is this something that is not available in the RGroupDecomposition Java class?


Thanks


Dong

ChemAxon 4a2fc68cd1

14-12-2016 09:23:18

Hi Dong,


The RGroupDecomposition class also supports this option, but under a different name: the setAlign() method. The code examples usually contain setAlign(true), so you probably also have this in your code. If you change it to setAlign(false) (or simply omit this method call), then the class will work in "keep coordinates" mode.


Best regards,
Péter Kovács

User b9aa8da57b

14-12-2016 14:30:21

Hi Peter, thanks for the information.


When I instantiate the RGroupDecomposition object rgd, I already call setAlign(true), which, as we all know, means the decomposed ligands/R-groups retain the 2D coordinates they have in the target molecule.


My question is this: looking at the decompositions, the target molecules are rotated relative to my input molecule, which I call the original target. I am only interested in the decomposition where the target obtained from decomposition.getTarget() is not rotated relative to my original target.


I do not think there is a parameter that can be set to pick out this decomposition.


If that is so, can we have a teleconference to discuss this issue further? I can show you real examples of what I am after. BMS does pay for ChemAxon technical support.


Regards


Dong

ChemAxon 58554172c4

14-12-2016 17:28:18

Hi Dong,


We can organize a TC, but please first send us some examples to [email protected] and continue the discussion in e-mail.


Best regards,


Árpád

User b9aa8da57b

14-12-2016 18:17:41

great, will do

User b9aa8da57b

17-12-2016 18:32:45

Hi, After looking through the methods of Molecule, I think that checking the width and height of the target molecules from the decompositions against those of the original target should crudely identify the decomposition I am after. It seems to me that the results should really be ranked so that the decomposition with no rotation comes first, followed by the ones with rotations. Currently, if you try the decomposition module in JChem for Excel, you can see that the results come out almost at random in terms of the rotation of the target molecule. I am sure most users would want to see the one without rotation first, and most of us deal primarily with R-groups on phenyl rings.


The following is my kludge to get myself moving; I hope you can provide a better solution.


.....



Decomposition d = rgd.findFirstDecomposition();
while (d != null) {
    Molecule decompTarget = d.getTarget();
    // Compare the rounded width and height with those of the original target
    int decompTw = (int) Math.round(decompTarget.calcWidth());
    int decompTh = (int) Math.round(decompTarget.calcHeight());
    if (decompTw == targetWidth && decompTh == targetHeight) {
        System.out.println(decompTarget.toFormat("mrv"));
        Molecule[] ligands = d.getLigands();
        for (int i = 0; i < ligands.length; i++) {
            // Print only the ligands whose query atom is an R-group atom
            if (d.getQuery().getAtom(i).getAtno() == MolAtom.RGROUP) {
                Molecule groupMol = ligands[i];
                groupMol.dearomatize();
                System.out.print(groupMol.toFormat("cxsmiles") + "\t");
            }
        }
        System.out.println();
    }
    d = rgd.findNextDecomposition();
}

ChemAxon 4a2fc68cd1

19-12-2016 11:22:04

Hi Dong,


Your solution helped us to understand your requirements, but it is not practical. It may not work correctly for some target molecules. For example, see the attached query and target structures.
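
One way to see why a width/height comparison can fail: a 180-degree rotation leaves the axis-aligned bounding box unchanged even though every atom moves. The following is a minimal plain-Java sketch of that effect, not ChemAxon code and not the attached structures. The class name BoundingBoxSketch, its helper methods, and the sample coordinates are all hypothetical, and points are plain {x, y} arrays rather than Molecule objects.

```java
// Illustration only: plain {x, y} arrays stand in for atom coordinates.
final class BoundingBoxSketch {

    /** Width of the axis-aligned bounding box of the given 2D points. */
    static double width(double[][] pts) {
        return extent(pts, 0);
    }

    /** Height of the axis-aligned bounding box of the given 2D points. */
    static double height(double[][] pts) {
        return extent(pts, 1);
    }

    private static double extent(double[][] pts, int axis) {
        double min = Double.POSITIVE_INFINITY;
        double max = Double.NEGATIVE_INFINITY;
        for (double[] p : pts) {
            min = Math.min(min, p[axis]);
            max = Math.max(max, p[axis]);
        }
        return max - min;
    }

    /** Rotates every point by 180 degrees about the origin: (x, y) -> (-x, -y). */
    static double[][] rotate180(double[][] pts) {
        double[][] out = new double[pts.length][];
        for (int i = 0; i < pts.length; i++) {
            out[i] = new double[] {-pts[i][0], -pts[i][1]};
        }
        return out;
    }

    public static void main(String[] args) {
        double[][] original = {{1.0, 1.0}, {4.0, 1.0}, {4.0, 2.0}};
        double[][] rotated = rotate180(original);
        // Same bounding box, different atom positions.
        System.out.println("same width:  " + (width(original) == width(rotated)));
        System.out.println("same height: " + (height(original) == height(rotated)));
        System.out.println("same atom 1: " + (original[1][0] == rotated[1][0]));
    }
}
```

In this sketch the rotated copy has the same width and height as the original, so a size-only filter would accept it as "not rotated".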


Instead, I would suggest computing an "alignment score" for each decomposition target compared to the original target orientation. Note that the decomposition targets are equivalent to the original target, only their alignment may differ, so we can describe the extent of rotation by the distance between the corresponding atoms in the two molecules.


Code example:


        RGroupDecomposition rgd = new RGroupDecomposition();
        rgd.getSearchOptions().setBridgingRAllowed(true);
        rgd.setQuery(query);
        rgd.setTarget(target);
        rgd.setAlign(true);

        Molecule origTarget = target.clone();
        Decomposition d = rgd.findFirstDecomposition();
        while (d != null) {
            Molecule decompTarget = d.getTarget();
            if (getAlignmentDifference(origTarget, decompTarget) < 1e-5) {
                // Decomposition found with the same orientation
                System.out.println(decompTarget.toFormat("mrv"));
                ...
            }
            d = rgd.findNextDecomposition();
        }

The scoring method:

public static double getAlignmentDifference(Molecule mol1, Molecule mol2) {
    if (mol1.getAtomCount() != mol2.getAtomCount()) {
        throw new IllegalArgumentException(
                "Input molecules should be equivalent, but their alignment may differ.");
    }

    double diff = 0;
    for (int i = 0; i < mol1.getAtomCount(); i++) {
        MolAtom a1 = mol1.getAtom(i);
        MolAtom a2 = mol2.getAtom(i);
        diff += Math.pow(a1.getX() - a2.getX(), 2);
        diff += Math.pow(a1.getY() - a2.getY(), 2);
        diff += Math.pow(a1.getZ() - a2.getZ(), 2);
    }

    return diff;
}

 

I hope this helps. This solution should work for any kind of query and target.

Best regards,
Péter 

User b9aa8da57b

19-12-2016 13:58:01

Thanks for the solution.


Would you consider adding getAlignmentDifference to the Marvin API?


Would you also consider ranking the decompositions so that the result with no rotation comes out first in the RGroupDecomposition API?


Thanks


Dong

ChemAxon 4a2fc68cd1

19-12-2016 15:07:12

Hi,


Let me fix the previous code. It turned out that a more careful computation is required for general inputs (because the target molecules are not only rotated but also translated).


The improved scoring method looks like this:


public static double getAlignmentDifference(Molecule mol1, Molecule mol2) {
    if (mol1.getAtomCount() != mol2.getAtomCount()) {
        throw new IllegalArgumentException(
                "Input molecules should be equivalent, but their alignment may differ.");
    }

    DPoint3 center1 = getCenterPoint(mol1);
    DPoint3 center2 = getCenterPoint(mol2);

    double diff = 0;
    for (int i = 0; i < mol1.getAtomCount(); i++) {
        DPoint3 p1 = DPoint3.subtract(mol1.getAtom(i).getLocation(), center1);
        DPoint3 p2 = DPoint3.subtract(mol2.getAtom(i).getLocation(), center2);
        diff += DPoint3.subtract(p1, p2).lengthSquare();
    }

    return diff;
}

public static DPoint3 getCenterPoint(Molecule mol) {
    if (mol.getAtomCount() == 0) {
        throw new IllegalArgumentException("Empty molecule");
    }
    DPoint3 result = new DPoint3();
    for (DPoint3 point : mol.getPoints()) {
        result.add(point);
    }
    result.scale(1.0 / mol.getAtomCount());
    return result;
}
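
To check the effect of subtracting the center points without a ChemAxon installation, the same arithmetic can be tried on plain coordinate arrays. This is only a sketch: the class CenteredDifferenceSketch, its method names, and the sample coordinates are hypothetical, and {x, y} arrays stand in for atom locations.

```java
// Illustration only: reproduces the centered squared-difference arithmetic
// on plain arrays instead of Molecule and DPoint3 objects.
final class CenteredDifferenceSketch {

    /** Arithmetic mean of the points, computed once per point set. */
    static double[] centroid(double[][] pts) {
        if (pts.length == 0) {
            throw new IllegalArgumentException("Empty point set");
        }
        double[] c = new double[pts[0].length];
        for (double[] p : pts) {
            for (int k = 0; k < c.length; k++) {
                c[k] += p[k];
            }
        }
        for (int k = 0; k < c.length; k++) {
            c[k] /= pts.length;
        }
        return c;
    }

    /** Sum of squared distances between corresponding points after centering both sets. */
    static double centeredDifference(double[][] a, double[][] b) {
        if (a.length != b.length) {
            throw new IllegalArgumentException("Point sets must have the same size");
        }
        double[] ca = centroid(a);
        double[] cb = centroid(b);
        double diff = 0;
        for (int i = 0; i < a.length; i++) {
            for (int k = 0; k < ca.length; k++) {
                double d = (a[i][k] - ca[k]) - (b[i][k] - cb[k]);
                diff += d * d;
            }
        }
        return diff;
    }

    public static void main(String[] args) {
        double[][] original = {{0, 0}, {1, 0}, {1, 2}};
        double[][] shifted = {{5, -3}, {6, -3}, {6, -1}};   // original moved by (5, -3)
        double[][] rotated = {{0, 0}, {-1, 0}, {-1, -2}};   // original rotated by 180 degrees
        System.out.println("shifted: " + centeredDifference(original, shifted));
        System.out.println("rotated: " + centeredDifference(original, rotated));
    }
}
```

With these sample points the shifted copy scores essentially zero (up to floating-point rounding), while the rotated copy scores well above 1, which is the behavior the centered version is meant to have.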




Also change the line

if (getAlignmentDifference(origTarget, decompTarget) < 1e-5)

to e.g.

if ((int) getAlignmentDifference(origTarget, decompTarget) == 0)

to make it safer (the cast accepts any score below 1). Or find the decomposition(s) with the minimum value(s).
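
The "minimum value" variant can be sketched as follows, assuming the scores from getAlignmentDifference have already been collected into an array, one per decomposition, in the order the decompositions were returned. The class MinimumScoreSketch and its method are hypothetical helpers, not part of any ChemAxon API.

```java
// Illustration only: picks the position of the smallest alignment score.
final class MinimumScoreSketch {

    /** Returns the index of the smallest score, or -1 if there are no scores. */
    static int indexOfMinimum(double[] scores) {
        int best = -1;
        for (int i = 0; i < scores.length; i++) {
            // Strict comparison keeps the first decomposition on ties.
            if (best < 0 || scores[i] < scores[best]) {
                best = i;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        // Made-up scores for three decompositions.
        double[] scores = {13.3, 0.0000004, 7.1};
        System.out.println("best decomposition index: " + indexOfMinimum(scores));
    }
}
```

Selecting the minimum avoids choosing a fixed threshold, at the cost of having to score every decomposition before one can be picked.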

Regards,
Péter

ChemAxon 4a2fc68cd1

19-12-2016 15:29:25

Hi Dong,


We do not plan to apply any ranking to RGroupDecomposition results, mainly because it would require computing all decompositions before even the first one can be returned, so it would make several use cases slower.


Péter

User b9aa8da57b

19-12-2016 17:18:34

Hi Peter, thanks for the update


Can I just use calcCenter() instead of calling your method? I have not compared the scores, though.


public static double getAlignmentDifference(Molecule mol1, Molecule mol2) {
    double diff = 0.0;
    for (int i = 0; i < mol1.getAtomCount(); i++) {
        DPoint3 p1 = DPoint3.subtract(mol1.getAtom(i).getLocation(), mol1.calcCenter());
        DPoint3 p2 = DPoint3.subtract(mol2.getAtom(i).getLocation(), mol2.calcCenter());
        diff += DPoint3.subtract(p1, p2).lengthSquare();
    }
    return diff;
}

ChemAxon 4a2fc68cd1

19-12-2016 21:07:01

Hi Dong,


Yes, it is better to replace the getCenterPoint() method with the calcCenter() method of the molecule, but it is still practical to store the results in variables instead of calculating them multiple times.


Péter