RGroupDecomposition API usage example

08-09-2005 21:34:03

I have a list of target molecules and a query molecule with, say, two R-groups defined. I want to find out whether each of my structures matches the query and, if it does, get an array (or collection) of two Molecule objects representing the ligands. Could you give me some hints on how to do this using the RGroupDecomposition API?

ChemAxon fb166edcbd

08-09-2005 21:54:53

The RGroupDecomposition API returns the matching ligands from findLigands(int[]):

http://www.chemaxon.com/jchem/doc/api/chemaxon/sss/search/RGroupDecomposition.html#findLigands(int[])


Note, however, that this molecule array maps each query atom to a Molecule object: either one of the R-group ligands or the scaffold, depending on the query atom. The R-group indexes of the query atoms are easy to determine yourself, but they are also returned by getQueryRMap().


In other words, the ligand molecule array contains the scaffold at each index where the query R-map is 0 (non-R-group node) and the corresponding R-group ligand at each index where the query R-map is non-zero (R-group node).
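The rule above can be sketched in plain Java without the JChem classes. This is only an illustration: the two arrays stand in for the values returned by findLigands(int[]) and getQueryRMap(), SMILES strings stand in for Molecule objects, and the names RGroupLigandPicker and pickLigands are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

final class RGroupLigandPicker {

    /**
     * Groups the per-query-atom ligand array by R-group index.
     * Entries whose R-map value is 0 hold the scaffold and are skipped.
     */
    static <M> Map<Integer, List<M>> pickLigands(int[] queryRMap, M[] ligands) {
        if (queryRMap.length != ligands.length) {
            throw new IllegalArgumentException("arrays must have one entry per query atom");
        }
        Map<Integer, List<M>> byRGroup = new TreeMap<>();
        for (int i = 0; i < queryRMap.length; i++) {
            int rindex = queryRMap[i];
            if (rindex != 0) {
                // A list per R-group index keeps queries with two R1 nodes unambiguous.
                byRGroup.computeIfAbsent(rindex, k -> new ArrayList<>()).add(ligands[i]);
            }
        }
        return byRGroup;
    }

    public static void main(String[] args) {
        // Made-up data: a five-atom query with R1 at index 2 and R2 at index 4.
        int[] queryRMap = {0, 0, 1, 0, 2};
        String[] ligands = {"scaffold", "scaffold", "[OH2:1]", "scaffold", "C1CC[CH2:2]C1"};
        System.out.println(pickLigands(queryRMap, ligands));
    }
}
```

Keying the result by R-group index avoids the problem of a bare ligand array, where it would be unclear which ligand belongs to which R-group.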





I attach sample code that demonstrates how to fetch the R-group ligands along these lines. Example run (the map numbers mark the corresponding R-group attachment points):





Code:



java RGDecompTest query.mol targets.sdf
Oc1cc(cc(C2CCC(Br)CC2)c1Cl)C3CCCC3
[OH2:1].C1CC[CH2:2]C1
BrC1CC[CH2:1]CC1.C1CC[CH2:2]C1
NCc1ccc(Cl)c(Oc2cc(cc(C3CCC(Br)CC3)c2Cl)C4CCCC4)c1
NCc1ccc(Cl)c(c1)[OH:1].C1CC[CH2:2]C1
BrC1CC[CH2:1]CC1.C1CC[CH2:2]C1
Clc1c(cc(cc1[OH:1])C2CCCC2)C3CCC(Br)CC3.N[CH3:2]
[H+:1].N[CH3:2]








I now see that this behaviour may not be very user-friendly, so I am planning to add another getter for the ligands.

It could be either a method that returns the ligands for a given R-group index, or the same findLigands() method with null entries in place of the scaffold structures. Returning only the R-group ligands in an array is not a good idea, because then we would not know which ligand corresponds to which R-group. Any suggestions are welcome.

09-09-2005 10:13:00

Given a target molecule and a query mol, we would like to color the target molecule so that the scaffold is displayed in blue, R1 in red and R2 in green.





Could you show us how that can be accomplished via your API? (basically we not only want to know what R1 is, but also where it is located in the original target molecule)

ChemAxon fb166edcbd

09-09-2005 10:31:22

The first API usage example in the RGroupDecomposition API class header shows some test code for this.





The idea is to call


http://www.chemaxon.com/jchem/doc/api/chemaxon/sss/search/RGroupDecomposition.html#findLigandIds(int[])


which returns an ID array (say 'ids') with ligand IDs for all target atoms:





Code:



ids[i] = the query atom index matching the R-group ligand attachment point, if atom i in target is a ligand atom
ids[i] = -1 if atom i in target is a scaffold atom
ids[i] = -2 if atom i in target is outside the hit








This lets us construct a color string in which the color codes are the R-group indexes. The API can build this string for you:


http://www.chemaxon.com/jchem/doc/api/chemaxon/sss/search/RGroupDecomposition.html#getTargetRMapString(int[])


Set this string in a (target) molecule property (SDF tag) for output and you can use mview with a Color-map config file to color target atoms according to this color map.





If you only need the color codes in an int[] array then use


http://www.chemaxon.com/jchem/doc/api/chemaxon/sss/search/RGroupDecomposition.html#getTargetRMap(int[]).
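To make the ID convention concrete, here is a minimal plain-Java sketch that turns an 'ids' array into per-atom color codes by looking up the query R-map. It is not the library's implementation of getTargetRMap(int[]): the class TargetColorCodes is hypothetical, and the codes used for scaffold atoms and for atoms outside the hit are passed in by the caller, because the thread does not say which values the API uses for them.

```java
import java.util.Arrays;

final class TargetColorCodes {

    static final int SCAFFOLD_ID = -1;    // ids[i] for scaffold atoms, as described above
    static final int OUTSIDE_HIT_ID = -2; // ids[i] for atoms outside the hit

    /**
     * Maps each target atom to a color code: the R-group index of the
     * query atom it is attached through, or a caller-chosen code for
     * scaffold atoms and atoms outside the hit (both are assumptions).
     */
    static int[] toColorCodes(int[] ids, int[] queryRMap, int scaffoldCode, int outsideCode) {
        int[] codes = new int[ids.length];
        for (int i = 0; i < ids.length; i++) {
            if (ids[i] == SCAFFOLD_ID) {
                codes[i] = scaffoldCode;
            } else if (ids[i] == OUTSIDE_HIT_ID) {
                codes[i] = outsideCode;
            } else {
                codes[i] = queryRMap[ids[i]]; // ligand atom: look up its R-group index
            }
        }
        return codes;
    }

    public static void main(String[] args) {
        // Made-up data: query atoms 2 and 4 are the R1 and R2 nodes.
        int[] queryRMap = {0, 0, 1, 0, 2};
        int[] ids = {-1, -1, 2, 2, 4, -2};
        System.out.println(Arrays.toString(toColorCodes(ids, queryRMap, 0, 0)));
    }
}
```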





I attach a test program demonstrating this. Run it with:





Code:



java RGDecompIDTest query.mol targets.sdf > o.sdf


mview -t DMAP -p Colors.txt o.sdf








or with piping:





Code:



java RGDecompIDTest query.mol targets.sdf | mview -t DMAP -p Colors.txt -


User 10a23c54c1

12-09-2005 00:58:31

What would be the easiest way of generating actual images with the R-groups colored? For instance, a search is performed against a list of target molecules with a query molecule containing two R-groups. The final result should be a list of, say, PNG images of the target molecules, with R1s colored red and R2s colored green. Working sample code for this would be helpful.





Thanks


Hayk

User 10a23c54c1

12-09-2005 01:01:39

One more question: given an array of atom indexes of a molecule, is there an easy way to create a 'sub-molecule' consisting only of the atoms whose indexes are in the input array?

ChemAxon fb166edcbd

12-09-2005 12:06:37

hayk wrote:
What would be the easiest way of generating actual images with the R-groups colored? For instance, a search is performed against a list of target molecules with a query molecule containing two R-groups. The final result should be a list of, say, PNG images of the target molecules, with R1s colored red and R2s colored green. Working sample code for this would be helpful.


I attach sample code for this. The bad news is that the color scheme is not customizable yet - but we decided to add this option for the next major Marvin release. Atom set 1 is colored red and atom set 2 is colored green by default, therefore in the sample code I simply set the query R-group index as atom set index.





You can call RGroupDecomposition.findLigandIds(int[]) to get the IDs in an int[] array:


http://www.chemaxon.com/jchem/doc/api/chemaxon/sss/search/RGroupDecomposition.html#findLigandIds(int[]).





You can set the atom set numbers by


MolAtom.setSetSeq(int).





Then you can export the molecule to image by calling


Molecule.toBinFormat(String).


The output format in our case is set to "png:-a,mono,setcolors" to dearomatize, suppress the default CPK atom coloring and use the default atom set coloring instead.





Then the sample code saves the resulting byte[] arrays into separate .png files for each result hit.





Run the example by:





Code:



java RGDecompImageTest query.mol targets.sdf








The result files are imgX-Y.png where X is the target molecule index and Y is the hit index.
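The final step, saving each byte[] under the imgX-Y.png naming scheme, needs only the standard library. The sketch below is an assumption about how that step might look, not the attached sample code: HitImageWriter and its methods are hypothetical names, and whether the indexes start at 0 or 1 is left to the caller because the thread does not say.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

final class HitImageWriter {

    /** Builds the file name for one hit of one target, e.g. img3-1.png. */
    static String fileName(int targetIndex, int hitIndex) {
        return "img" + targetIndex + "-" + hitIndex + ".png";
    }

    /** Writes the image bytes (e.g. from Molecule.toBinFormat) into the given directory. */
    static Path write(Path dir, int targetIndex, int hitIndex, byte[] png) throws IOException {
        Files.createDirectories(dir);
        return Files.write(dir.resolve(fileName(targetIndex, hitIndex)), png);
    }

    public static void main(String[] args) throws IOException {
        // Placeholder bytes; real code would pass the exported image data.
        byte[] fakePng = {1, 2, 3};
        System.out.println(write(Paths.get("images"), 0, 0, fakePng));
    }
}
```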

ChemAxon fb166edcbd

12-09-2005 12:52:43

hayk wrote:
One more question: given an array of atom indexes of a molecule, is there an easy way to create a 'sub-molecule' consisting only of the atoms whose indexes are in the input array?
I think the easiest way is to clone() the molecule and then remove the atoms that are not needed by removeNode(CNode).


In the sample code below, 'mol' is the original molecule and 'needed' is a boolean array that is set to 'true' precisely for the atom indexes needed in the submolecule:








Code:



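// Work on a clone so that the original molecule is left untouched.
ArrayList remove = new ArrayList();
Molecule submol = (Molecule)mol.clone();

// Collect the atoms to drop before removing anything,
// because atom indexes may change after each removal.
for (int i = 0; i < needed.length; ++i) {
    if (!needed[i]) {
        remove.add(submol.getAtom(i));
    }
}

// Remove the collected atoms from the clone.
for (int i = remove.size() - 1; i >= 0; --i) {
    submol.removeNode((CNode)remove.get(i));
}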








Note that the atoms to be removed must be collected first and removed afterwards, because atom indexes can change after an atom is removed, so the 'needed' array may become invalid. In general, removing an atom shifts all atoms with larger indexes while atoms with smaller indexes keep theirs, so you could remove atoms in decreasing index order right away without the removal list. However, this is undocumented behavior and does not hold in all cases, e.g. for reaction molecules.
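The index-shifting pitfall can be reproduced with an ordinary java.util.List, which also shifts later elements down on removal. The following sketch is only an analogy, assuming a list of atom labels in place of a Molecule; the class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.IdentityHashMap;
import java.util.List;
import java.util.Set;

final class SubsetByRemoval {

    /** Safe variant: collect the unwanted objects first, remove them afterwards. */
    static <T> List<T> collectThenRemove(List<T> atoms, boolean[] needed) {
        List<T> copy = new ArrayList<>(atoms); // plays the role of mol.clone()
        Set<T> toRemove = Collections.newSetFromMap(new IdentityHashMap<T, Boolean>());
        for (int i = 0; i < needed.length; i++) {
            if (!needed[i]) {
                toRemove.add(copy.get(i));
            }
        }
        copy.removeIf(toRemove::contains); // removal by object, not by index
        return copy;
    }

    /** Buggy on purpose: removes by ascending index while the indexes are shifting. */
    static <T> List<T> naiveAscending(List<T> atoms, boolean[] needed) {
        List<T> copy = new ArrayList<>(atoms);
        for (int i = 0; i < needed.length; i++) {
            if (!needed[i] && i < copy.size()) {
                copy.remove(i); // after the first removal, index i is a different atom
            }
        }
        return copy;
    }

    public static void main(String[] args) {
        List<String> atoms = Arrays.asList("a0", "a1", "a2", "a3", "a4");
        boolean[] needed = {true, false, false, true, true};
        System.out.println(collectThenRemove(atoms, needed)); // [a0, a3, a4]
        System.out.println(naiveAscending(atoms, needed));    // [a0, a2, a4]
    }
}
```

The naive loop wrongly drops a3 and keeps a2, because removing a1 moved every later element one position down before the 'needed' flags were consulted again.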

ChemAxon fb166edcbd

09-12-2005 00:29:05

I have added three more methods to the RGroupDecomposition API; these will be included in the next major JChem release (JChem 3.2).





The first method, findLigand(), creates a single ligand or scaffold molecule for a given ligand ID (query R-group atom index, or -1 for the scaffold) without producing all ligand molecules. The implementation selects the target atoms contained in the specified ligand or scaffold and removes all other atoms from the target.


Code:



    /**
     * Returns the target submolecule corresponding to a ligand or scaffold.
     * The submolecule is part of the cloned target, not the original one.
     * @param ligandId is the ligand ID: the least attachment query rgroup atom index
     *                 corresponding to a ligand attachment,
     *                 <code>-1</code> for scaffold
     *                 (a member of the 'ligandIds' array)
     * @param hit is the search hit
     * @param ligandIds is the ligand ID array returned by {@link #findLigandIds(int[])}
     * @return the submolecule, <code>null</code> for invalid ligandId
     * @since JChem 3.2
     */
    public Molecule findLigand(int ligandId, int[] hit, int[] ligandIds);








The other two methods, getLigandIds() and getLigandId(), determine the ligand ID(s) corresponding to a given R-group index. That is, they map the R-group index to the corresponding query atom index(es). You can use these methods to get the ligand ID to pass to the findLigand() method above.


Code:



    /**
     * Returns the ligand IDs (the least attachment R-group query indexes)
     * corresponding to a given R-group index.
     * This is only one ID (wrapped in a one-length array), except when
     * there are more R-group nodes with the same R-group index in the query
     * (e.g. there are two R1 nodes).
     * @param rindex is the R-group index
     *               (e.g. <code>1</code> for R1, <code>2</code> for R2)
     * @param hit is the search hit
     * @param ligandIds is the ligand ID array returned by {@link #findLigandIds(int[])}
     * @return the corresponding ligand IDs,
     *         an empty array if there is no r-group with the given 'rindex'
     * @since JChem 3.2
     */
    public int[] getLigandIds(int rindex, int[] hit, int[] ligandIds);

    /**
     * Returns the ligand ID (the attachment R-group query index)
     * corresponding to a given R-group index.
     * Use {@link #getLigandIds(int, int[], int[])} if there are more R-group nodes
     * with the same R-group index in the query (e.g. there are two R1 nodes).
     * @param rindex is the R-group index
     *               (e.g. <code>1</code> for R1, <code>2</code> for R2, <code>0</code> for scaffold)
     * @param hit is the search hit
     * @param ligandIds is the ligand ID array returned by {@link #findLigandIds(int[])}
     * @return the corresponding ligand ID, <code>-1</code> if 'rindex' is <code>0</code>
     *         (scaffold) or <code>-2</code> if there is no r-group with the given 'rindex'
     * @since JChem 3.2
     */
    public int getLigandId(int rindex, int[] hit, int[] ligandIds);
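The return conventions documented above can be modelled in a few lines of plain Java. This is a simplified sketch, not the JChem implementation: it works from a query R-map array alone, ignores the 'hit' and 'ligandIds' parameters, treats every R-group node as its own ligand ID, and returns an empty array for rindex 0 (an assumption, since the Javadoc does not cover that case). The class LigandIdConventions is hypothetical.

```java
import java.util.Arrays;
import java.util.stream.IntStream;

final class LigandIdConventions {

    /** Query atom indexes whose R-map value equals rindex; empty if there are none. */
    static int[] ligandIdsFor(int rindex, int[] queryRMap) {
        if (rindex <= 0) {
            return new int[0]; // assumption: only real R-groups (R1, R2, ...) have ligand IDs
        }
        return IntStream.range(0, queryRMap.length)
                .filter(i -> queryRMap[i] == rindex)
                .toArray();
    }

    /** Single-ID variant: -1 for the scaffold (rindex 0), -2 if no such R-group exists. */
    static int ligandIdFor(int rindex, int[] queryRMap) {
        if (rindex == 0) {
            return -1;
        }
        int[] ids = ligandIdsFor(rindex, queryRMap);
        return ids.length == 0 ? -2 : ids[0];
    }

    public static void main(String[] args) {
        // Made-up query R-map with two R1 nodes (indexes 1 and 4) and one R2 node (index 3).
        int[] queryRMap = {0, 1, 0, 2, 1};
        System.out.println(Arrays.toString(ligandIdsFor(1, queryRMap)));
        System.out.println(ligandIdFor(2, queryRMap));
        System.out.println(ligandIdFor(3, queryRMap));
    }
}
```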








I attach some test code, but it runs only with JChem 3.2 or later.


Here is a sample run:


Code:



java RGDecompMolTest q.mol t.smiles





The result printout is in output.txt (attached).





Is this API extension appropriate for you?

ChemAxon fb166edcbd

24-12-2005 12:51:23

Nora wrote:
hayk wrote:
What would be the easiest way of generating actual images with the R-groups colored? For instance, a search is performed against a list of target molecules with a query molecule containing two R-groups. The final result should be a list of, say, PNG images of the target molecules, with R1s colored red and R2s colored green. Working sample code for this would be helpful.


I attach sample code for this. The bad news is that the color scheme is not customizable yet - but we decided to add this option for the next major Marvin release.


Now we have added custom atomset coloring to our image export modules. I have updated the sample code: commented out the default coloring and added code for custom atomset coloring:





Code:



byte[] png = mol.toBinFormat("png:-a,mono,setcolors:a1:blue:a2:orange");
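If the colors vary per run, the format string can be assembled from a map of atom set index to color name. The sketch below assumes the option syntax is exactly what the one-line example shows (":aN:color" appended per atom set); the class SetColorFormat is hypothetical, and the sketch does not check which color names the image exporter accepts.

```java
import java.util.LinkedHashMap;
import java.util.Map;

final class SetColorFormat {

    /** Builds a format string such as "png:-a,mono,setcolors:a1:blue:a2:orange". */
    static String pngFormat(Map<Integer, String> colorBySet) {
        StringBuilder sb = new StringBuilder("png:-a,mono,setcolors");
        for (Map.Entry<Integer, String> e : colorBySet.entrySet()) {
            sb.append(":a").append(e.getKey()).append(':').append(e.getValue());
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        Map<Integer, String> colors = new LinkedHashMap<>(); // keeps insertion order
        colors.put(1, "blue");   // atom set 1 = R1
        colors.put(2, "orange"); // atom set 2 = R2
        System.out.println(pngFormat(colors));
    }
}
```

With an empty map the method returns "png:-a,mono,setcolors", the format string used earlier in the thread for the default atom set colors.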








Run the example by:





Code:



java RGDecompImageTest query.mol targets.sdf








with the same test files.





The result files are imgX-Y.png where X is the target molecule index and Y is the hit index. The atom set for R1 will be colored blue and the atom set for R2 orange. Note that this will be available from Marvin 4.1 (JChem 3.2).