chemaxon.sss.search.MCS

User 3898c01b63

11-08-2009 20:08:01

Dear Helper,


  I am trying to test chemaxon.sss.search.MCS to find the maximum common structure of two molecules in JChem 5.2.3 on Windows XP.


Here is my code:


----------------------------------


  import chemaxon.formats.MolImporter;
  import chemaxon.formats.MolExporter;
  import chemaxon.struc.Molecule;
  import chemaxon.sss.search.MCS;
  import java.io.File;
  import java.io.IOException;

  public class MCSTest {

      public static void main(String[] args) {
          try {
             // create an MCS
             MCS mcs = new MCS();
             // load the query molecule from "struc2.smiles"
             MolImporter importer = new MolImporter("struc2.smiles");
             Molecule mol = importer.read();
             importer.close();
             
             System.out.println( mol.getInputFormat() );
             System.out.println( mol.toFormat( "cxsmiles" ) );
             
             mcs.setQuery( mol );
             
             // load the target molecule from "struc.smiles"
             importer = new MolImporter("struc.smiles");
             mol = importer.read();
             importer.close();
             
             System.out.println( mol.getInputFormat() );
             System.out.println( mol.toFormat( "cxsmiles" ) );
             
             mcs.setTarget( mol );
             
             System.out.println( mcs.search() );
             
             int [] mcs_atoms = mcs.getResult();
             if( mcs_atoms != null ) System.out.println( "Result " + mcs_atoms.length );
             else System.out.println( "Result: null" );

          } catch (IOException e) {
             e.printStackTrace();
             System.exit(1);
          }//end catch
      }//end main
  }//end MCSTest


----------------------------------


The output of this simple program is:


----------------------------------


cxsmiles
ClC1CCCCC1
cxsmiles
C1CCCCC1
false
Result: null


----------------------------------


However, I am expecting to get C1CCCCC1 as the MCS.  What did I do wrong? 


On the command line, mcs -q struc.smiles -t struc2.smiles -w yielded nothing.


Please help.  Thank you in advance.

ChemAxon efa1591b5a

12-08-2009 13:24:47

Dear User,


Thanks for using the mcs batch application. The default minimum MCS size is 9 atoms, which is why your 6-membered ring was not found. Use the -s 6 option on the command line to allow common structures as small as 6 atoms.


Besides that, a bug introduced by a recent modification to the algorithm may result in failure to find the MCS for small inputs, but this affects only the default matching mode, which is 'exact matching'.


As a workaround, we recommend the use of the -m turbo option.


So, in case of your command line example 


mcs -q struc.smiles -t struc2.smiles -m turbo -s 6

should work and meet your expectations.


Similar modifications can be made in your Java code:


mcs.setMinimumCommonSize(6);

mcs.setMode(MCS.MODE_TURBO);

I hope this helps, and please be assured that the bug related to exact mode will be fixed asap.


Kind regards,


Miklos


 

User 3898c01b63

14-08-2009 20:10:17

Dear Miklos,


  Thank you for your kind reply; my test program now works.  This time I used larger molecules:


C1CCCCC1CCCCCCC (see struc.smiles) and CCCCCCC1CCCCC1Cl (see struc2.smiles).


mcs -q struc.smiles -t struc2.smiles -w gives reasonable results and a plot:


-----


C:\javaTry>mcs -q struc.smiles -t struc2.smiles -w
query atom -> target atom mapping:
1 -> 8
2 -> 9
3 -> 10
4 -> 11
5 -> 12
6 -> 7
7 -> 6
8 -> 5
9 -> 4
10 -> 3
11 -> 2
12 -> 1
query molecule bonds:
1-2
1-6
2-3
3-4
4-5
5-6
6-7
7-8
8-9
9-10
10-11
11-12
target molecule bonds:
8-9
7-8
9-10
10-11
11-12
7-12
6-7
5-6
4-5
3-4
2-3
1-2


-----


  The numbers shown above can easily be assigned to the atoms of the original molecules.
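For readers who want to work with this listing programmatically, here is a minimal sketch that reads the "query atom -> target atom mapping" lines into a map. It is plain Java with no JChem dependency; the class name McsMappingParser and the sample lines are illustrative only, and the line format is assumed to be exactly what the listing above shows.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not part of JChem: parses the text printed by the mcs tool.
final class McsMappingParser {

    /** Reads lines of the form "1 -> 8" into a query-to-target map; all other lines are skipped. */
    static Map<Integer, Integer> parse(List<String> lines) {
        Map<Integer, Integer> mapping = new LinkedHashMap<>();
        for (String line : lines) {
            String[] parts = line.trim().split("\\s*->\\s*");
            if (parts.length != 2) {
                continue; // bond lines such as "1-2" and section titles without an arrow
            }
            try {
                mapping.put(Integer.parseInt(parts[0]), Integer.parseInt(parts[1]));
            } catch (NumberFormatException e) {
                // the header line contains "->" but no numbers, so it is ignored
            }
        }
        return mapping;
    }

    public static void main(String[] args) {
        List<String> output = List.of(
                "query atom -> target atom mapping:",
                "1 -> 8",
                "6 -> 7",
                "12 -> 1",
                "query molecule bonds:",
                "1-2");
        System.out.println(parse(output));
    }
}
```

The atom numbers are kept exactly as printed, so they stay comparable with the listing.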


I tried to reproduce this result through the API in Java (see MCSTest.java); however, the output is hard to interpret.


-----


cxsmiles
CCCCCCC1CCCCC1Cl
cxsmiles
CCCCCCCC1CCCCC1
11
getResultQueryAtoms getResultTargetAtoms getResult
                  1                   10        -1
                  2                    9        10
                  3                    8         9
                  4                    7         8
                  5                    6         7
                  6                    5         6
                  7                    0         5
                  8                    1         0
                  9                    2         1
                 10                    3         2
                 11                    4         3
CCCCCC1CCCCC1
CCCCCC1CCCCC1


-----


  First of all, the number of atoms in the MCS is 11 instead of 12.  I know I am missing something about the mapping.  Would you please help me out?  What I want to do is reproduce the command-line mcs output, like C:\Program Files\ChemAxon\JChem\examples\sss\mcs.jsp from your company.
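One possible way to get command-line style "query -> target" lines from an index array is sketched below. This rests on an assumption that is not confirmed anywhere in this thread: that the array is indexed by 0-based query atom and holds the 0-based target atom index, with -1 for an unmatched atom. The class name MappingPrinter and the sample array are hypothetical, and nothing here calls JChem.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of JChem.
final class MappingPrinter {

    /**
     * Converts a 0-based mapping array into 1-based "query -> target" strings.
     * ASSUMPTION: index = query atom, value = target atom, negative = no match.
     */
    static List<String> toOneBasedPairs(int[] queryToTarget) {
        List<String> pairs = new ArrayList<>();
        for (int q = 0; q < queryToTarget.length; q++) {
            int t = queryToTarget[q];
            if (t < 0) {
                continue; // unmatched query atom
            }
            pairs.add((q + 1) + " -> " + (t + 1));
        }
        return pairs;
    }

    public static void main(String[] args) {
        int[] sample = {2, -1, 0}; // made-up data, not taken from the output above
        toOneBasedPairs(sample).forEach(System.out::println);
    }
}
```

If the array turns out to have different semantics, only the loop body needs to change.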


  Thank you in advance.


  Tiqing Liu From www.bindingdb.org

ChemAxon efa1591b5a

25-08-2009 08:31:03

Dear Tiqing,


Your enquiry has not been forgotten or ignored; I apologise for not having been able to deal with it yet. I will thoroughly investigate what goes wrong when you calculate the MCS via the API and why the results differ.


Thank you for your patience.


Kind regards,


Miklos


 

ChemAxon efa1591b5a

27-08-2009 08:57:54

Dear Tiqing,


I managed to look at the problem you reported in your recent post. Please accept my apologies for not being able to deal with this issue sooner.


I'm afraid that your test case revealed a weird bug. Notice that the query and target (that is, the 1st and 2nd structures) are swapped in your Java code compared with the command line, and this caused the loss of the solution! This seems to be a really nasty bug, which we haven't investigated yet. Anyway, if you swap them in your Java code, you get the same solution as with the command line application.


I suggest a quick-and-dirty workaround until this problem gets fixed: always run the MCS search twice, with your structures swapped in the second run. Hopefully that's not much slower, and it increases the chance of finding the right solution (i.e. the larger of the two results).
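The workaround can be sketched roughly as follows. To keep the example runnable without JChem, the actual search is passed in as a function; in real code that function would wrap the setQuery, setTarget, search and getResult calls from the program earlier in this thread. The class name SwapRetry and the stand-in search in main are made up for illustration.

```java
import java.util.function.BiFunction;

// Hypothetical helper, not part of JChem.
final class SwapRetry {

    /**
     * Runs the search as (a, b) and as (b, a) and returns the larger result.
     * A null result counts as size 0. Note that when the swapped run wins,
     * the returned indices refer to b as the query and a as the target.
     */
    static <M> int[] largerOfBothDirections(M a, M b, BiFunction<M, M, int[]> search) {
        int[] forward = search.apply(a, b);
        int[] backward = search.apply(b, a);
        return size(backward) > size(forward) ? backward : forward;
    }

    private static int size(int[] result) {
        return result == null ? 0 : result.length;
    }

    public static void main(String[] args) {
        // Stand-in search that "fails" in one direction, to mimic the reported behaviour.
        BiFunction<String, String, int[]> fakeSearch =
                (q, t) -> q.length() <= t.length() ? new int[q.length()] : null;
        int[] best = largerOfBothDirections("CCCCCCC1CCCCC1Cl", "C1CCCCC1CCCCCCC", fakeSearch);
        System.out.println(best == null ? "no result" : "result size " + best.length);
    }
}
```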


I hope we will be able to find and fix the bug soon and the bug-fix can be released soon after.


Kind regards,


Miklos


 

User 3898c01b63

22-09-2009 14:15:28










mvargyas wrote:

I'm afraid that your test case revealed a weird bug.



Dear Miklos,


  We tested several molecules for MCS finding. I guess the command-line mcs also has some problems with getResultQueryAtoms(), getResultTargetAtoms(), and getResult().  1hvr_XK2_1_A_263__C__.mol is enclosed as the target molecule; BindingDB_22.mol, BindingDB_105.mol, and BindingDB_153.mol are enclosed as query molecules.


1hvr_X2K_22_mcs.txt is also enclosed; it was produced by mcs -q BindingDB_22.mol -t 1hvr_XK2_1_A_263__C__.isdf.  As you can see, in 1hvr_X2K_22_mcs.txt several atoms are mapped to the same atom.


 The JChem version that gave these results is still 5.2.3.  I understand that there will be a bug fix for mcs in 5.3.  I hope that this finding will give the developers some hints.
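A check of this kind can be automated. The sketch below reports every target atom that is assigned to more than one query atom in a query-to-target mapping. It is plain Java; the class name MappingCheck and the sample data are hypothetical and are not taken from the enclosed file.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical helper, not part of JChem.
final class MappingCheck {

    /** Returns each target atom hit by two or more query atoms, with the query atoms involved. */
    static Map<Integer, List<Integer>> duplicateTargets(Map<Integer, Integer> queryToTarget) {
        Map<Integer, List<Integer>> byTarget = new TreeMap<>();
        for (Map.Entry<Integer, Integer> e : queryToTarget.entrySet()) {
            byTarget.computeIfAbsent(e.getValue(), k -> new ArrayList<>()).add(e.getKey());
        }
        // keep only targets that occur more than once; a valid mapping leaves this empty
        byTarget.values().removeIf(queries -> queries.size() < 2);
        return byTarget;
    }

    public static void main(String[] args) {
        Map<Integer, Integer> mapping = new LinkedHashMap<>();
        mapping.put(1, 8); // made-up example values
        mapping.put(2, 8);
        mapping.put(3, 9);
        System.out.println(duplicateTargets(mapping));
    }
}
```

An empty result means the mapping is one-to-one, which is what a common substructure mapping should be.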


  Best regards to you and the developers,


Tiqing

ChemAxon efa1591b5a

25-09-2009 13:20:04

Dear Tiqing,


Thank you for the thorough analysis and the detailed report; it is certainly a great help for us in tracing the bugs. We will check these problems and fix them ASAP. We'll inform you when the fixes can be released.


Kind regards,


Miklos

ChemAxon efa1591b5a

31-03-2011 14:20:01

















Dear Tiqing,


Please note that a new version of MCS search (actually, an MCES search) has been released. It should handle your test molecules better than the old version.


All feedback is welcome.


Regards


Miklos