Subtract a substructure from a structure

User 820e1cd6b2

26-10-2010 14:22:09

Hi,

I would like to subtract a substructure from a structure and keep the remaining parts. I don't calculate the substructure with ChemAxon code but get it from somewhere else. Here is some sample code (assuming mcs is a maximum common substructure of a set of molecules that includes mol1), which does not work:

       Molecule mcs = MolImporter.importMol("O=Cc1ccccc1");
       Molecule mol1 = MolImporter.importMol("OC(=O)c1cccc(c1)Cl");

       MolAtom[] atoms = mcs.getAtomArray();
       for(int i = 0; i<atoms.length; i++){
           mol1.removeNode(atoms[i],0);
       }
       System.out.println("Mol: " + mol1.exportToFormat("SMILES"));


Judging by the SMILES output ("OC(=O)c1cccc(c1)Cl"), the atoms haven't been subtracted from the molecule. Any thoughts?


Best regards,

User c31567e5e3

27-10-2010 05:20:28

If you're getting the mcs from elsewhere, then you should consider generating it with the appropriate atom mapping in the output. Otherwise, you have to use MolSearch to find the atom mapping yourself; e.g., something like this:


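       import chemaxon.formats.MolImporter;
       import chemaxon.sss.search.MolSearch;
       import chemaxon.struc.MolAtom;
       import chemaxon.struc.Molecule;
       import java.util.ArrayList;
       import java.util.List;

       Molecule mcs = MolImporter.importMol("O=Cc1ccccc1");
       Molecule mol1 = MolImporter.importMol("OC(=O)c1cccc(c1)Cl");

       // Search for the mcs (query) inside mol1 (target)
       MolSearch msearch = new MolSearch();
       msearch.setQuery(mcs);
       msearch.setTarget(mol1);

       // hit[i] is the index of the mol1 atom matched by atom i of the mcs
       int[] hit = msearch.findFirst();
       if (hit != null) {
           // Collect the atom objects first, because indices shift once atoms are removed
           List<MolAtom> remove = new ArrayList<MolAtom>();
           for (int i = 0; i < hit.length; i++) {
               remove.add(mol1.getAtom(hit[i]));
           }
           for (MolAtom atom : remove) {
               mol1.removeNode(atom);
           }
       }
       System.out.println("Mol: " + mol1.exportToFormat("SMILES"));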

Cheers

User 820e1cd6b2

27-10-2010 12:39:05

Hi,

Thank you! That works for me! Is there a way to find out at which position of the removed substructure each disconnected fragment was attached, e.g., as an array storing the ID of the substructure atom each fragment was bonded to before the removal? This would be helpful for comparing the differences (e.g., the location of the atoms attached to the common substructure) between molecules sharing a common substructure.

Thank you and best regards

User c31567e5e3

27-10-2010 16:39:38

I think this is pretty straightforward (unless I misunderstood you). The best way is to use the setAtomMap/getAtomMap methods. Here is the reworked code, which keeps the positions in the mcs core where the attachments occur. Note that atom maps are 1-based, so you should subtract 1 when using them to index into the mcs.


       // mol1 and hit are the same as in the previous snippet
       Set<MolAtom> remove = new HashSet<MolAtom>();
       for (int i = 0; i < hit.length; i++) {
           MolAtom atom = mol1.getAtom(hit[i]);
           atom.setAtomMap(i + 1); // save position in the mcs, 1-based
           remove.add(atom);
       }
       for (MolAtom atom : remove) {
           // Copy the core position onto every neighbor that stays in the molecule
           for (int i = 0; i < atom.getBondCount(); ++i) {
               MolBond bond = atom.getBond(i);
               MolAtom xatom = bond.getOtherAtom(atom);
               if (!remove.contains(xatom)) {
                   xatom.setAtomMap(atom.getAtomMap());
               }
           }
           mol1.removeNode(atom);
       }
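
The bookkeeping above can also be tried out without any ChemAxon classes. What follows is only a minimal sketch on a plain bond list: the class name AttachmentSketch, the method attachmentPoints, and the hand-numbered atom indices and hit array are hypothetical and chosen for illustration (atoms of OC(=O)c1cccc(c1)Cl numbered 0 to 9 in SMILES order); they are not MolSearch output.

```java
import java.util.LinkedHashMap;
import java.util.Map;

class AttachmentSketch {
    /**
     * For every atom outside the core that is bonded to a core atom,
     * returns the 1-based core position it was attached to.
     * bonds: pairs of atom indices; hit[i]: molecule atom matched by core atom i.
     */
    static Map<Integer, Integer> attachmentPoints(int[][] bonds, int[] hit, int atomCount) {
        int[] corePos = new int[atomCount]; // 0 = not in core, otherwise 1-based core position
        for (int i = 0; i < hit.length; i++) {
            corePos[hit[i]] = i + 1;
        }
        Map<Integer, Integer> result = new LinkedHashMap<>();
        for (int[] b : bonds) {
            int a = b[0], c = b[1];
            if (corePos[a] != 0 && corePos[c] == 0) {
                result.put(c, corePos[a]);
            } else if (corePos[c] != 0 && corePos[a] == 0) {
                result.put(a, corePos[c]);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // Hand-written bond list for OC(=O)c1cccc(c1)Cl (assumed numbering)
        int[][] bonds = {{0,1},{1,2},{1,3},{3,4},{4,5},{5,6},{6,7},{7,8},{8,3},{7,9}};
        // Assumed mapping of the core O=Cc1ccccc1 onto the molecule
        int[] hit = {2, 1, 3, 4, 5, 6, 7, 8};
        System.out.println(attachmentPoints(bonds, hit, 10)); // prints {0=2, 9=7}
    }
}
```

Here the hydroxyl oxygen (atom 0) is reported at core position 2 and the chlorine (atom 9) at core position 7; subtracting 1 gives the 0-based index of the core atom.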

ChemAxon 4a2fc68cd1

29-10-2010 23:57:16

Hi,


I agree with the answers. If you get the MCS from elsewhere, then you have to search for it in the target molecule. (Note that distinct Molecule objects have distinct sets of MolAtom and MolBond objects, and you need a map to set up a connection between them.)
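
The point about distinct objects can be illustrated with a small library-free sketch. It assumes atoms are compared by object identity; the Atom class, parse, and removeMapped are hypothetical stand-ins, not ChemAxon's MolAtom or any ChemAxon API.

```java
import java.util.ArrayList;
import java.util.List;

class IdentitySketch {
    /** Hypothetical stand-in for an atom; equals() is not overridden, so identity is used. */
    static final class Atom {
        final String symbol;
        Atom(String symbol) { this.symbol = symbol; }
    }

    /** Builds a fresh list of atom objects, like importing a structure. */
    static List<Atom> parse(String... symbols) {
        List<Atom> atoms = new ArrayList<>();
        for (String s : symbols) {
            atoms.add(new Atom(s));
        }
        return atoms;
    }

    /** Removes the target atoms named by the map; map[i] = target index of query atom i. */
    static void removeMapped(List<Atom> target, int[] map) {
        List<Atom> doomed = new ArrayList<>();
        for (int idx : map) {
            doomed.add(target.get(idx)); // collect objects first, indices shift on removal
        }
        target.removeAll(doomed);
    }

    public static void main(String[] args) {
        List<Atom> query = parse("O", "C");
        List<Atom> target = parse("O", "C", "O", "Cl");

        target.removeAll(query);            // no effect: the query atoms are different objects
        System.out.println(target.size());  // prints 4

        removeMapped(target, new int[] {2, 1}); // go through an index map instead
        System.out.println(target.size());  // prints 2
    }
}
```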


By the way, you may also be interested in ChemAxon's MCS search tools. The upcoming release 5.4 will contain an improved MCS algorithm, which can also find disconnected common substructures. Using this tool, your task would be easier. For example, it provides member functions for obtaining the matched and unmatched atoms and bonds.


Would you be interested in beta testing release 5.4?


Best regards,
Peter