Bug in MCS computation?

User 820e1cd6b2

20-09-2012 16:17:28

Hi,

I found some weird behavior in the MCES method that I don't understand. Here is my code:

MCES mcs = new MCES();
mcs.setMolecules(mol,mol2);
mcs.setMinComponentSize(1);
mcs.setKeepLargestComponent(true);
mcs.setSearchMode(MCES.SearchMode.EXHAUSTIVE);
mcs.search();
Molecule mcsMol = mcs.getAsMolecule();    

Here are some examples for which in my opinion the result of the MCS computation differs from the correct one:
a) N=C(Nc1ccccc1)Nc1ccccc1 and CN(C)c1ccccc1: MCS = c1ccccc1. In my opinion the result should be Nc1ccccc1.
b) N=C(Nc1ccccc1)Nc1ccccc1 and CN(C)c1ccccc1: MCS = [c]1ccccc1. In my opinion the result should be CNc1ccccc1.
c) CCCCCCCCCc1ccc(O)cc1 and OC(c1ccc(Cl)cc1)(c1ccc(Cl)cc1)C(Cl)(Cl)Cl: MCS = c1ccccc1. In my opinion the result should be c1ccccc1CC.
d) NCCCCCCCCCCC(O)=O and CC(=C)C1CCC(C)=CC1: MCS = CCCCC. In my opinion the result should be CCCCCCC.

Is this a bug or is there something wrong in my understanding of the procedure?

Thanks and best regards

ChemAxon 4a2fc68cd1

20-09-2012 20:29:05

Hi,


This is a known deficiency of our current MCES implementation. Your expectations are correct, but this implementation can fail to find the optimal connected MCES.


This deficiency is inherent in the currently applied algorithm, which basically tries to find the optimal disconnected MCES. With setKeepLargestComponent(true), we simply discard all components but the largest one. Unfortunately, the largest component of the largest MCES is not necessarily the same as the largest connected MCES, which is what you are looking for.


For example, for input d), the optimal MCES is "CCCCC.CCCC", which is correctly found by the algorithm. Its largest component is CCCCC, while the largest connected MCES is CCCCCCC (as you wrote).
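
The size comparison for input d) can be made concrete with a small sketch. This is only a toy illustration, not the MCES implementation: the class LargestComponentDemo and its methods are hypothetical, and it assumes fragments are unbranched carbon chains written with 'C' only, so that a chain of n atoms has n - 1 bonds.

```java
import java.util.Arrays;
import java.util.Comparator;

// Hypothetical helper class, not part of JChem.
class LargestComponentDemo {

    // Bond count of one unbranched carbon chain such as "CCCCC".
    // Toy assumption: only 'C' atoms, no rings, branches or bond symbols.
    static int bondCount(String chain) {
        return Math.max(0, chain.length() - 1);
    }

    // Total bond count of a dot-separated (possibly disconnected) fragment set.
    static int totalBonds(String smiles) {
        return Arrays.stream(smiles.split("\\."))
                .mapToInt(LargestComponentDemo::bondCount)
                .sum();
    }

    // Keeps only the component with the most bonds, which is what
    // discarding all but the largest component amounts to in this toy model.
    static String largestComponent(String smiles) {
        return Arrays.stream(smiles.split("\\."))
                .max(Comparator.comparingInt(LargestComponentDemo::bondCount))
                .orElse("");
    }

    public static void main(String[] args) {
        String disconnectedOptimum = "CCCCC.CCCC"; // optimal disconnected MCES for d)
        String connectedOptimum = "CCCCCCC";       // largest connected MCES for d)
        String kept = largestComponent(disconnectedOptimum);

        System.out.println("disconnected optimum bonds: " + totalBonds(disconnectedOptimum));
        System.out.println("kept component: " + kept + ", bonds: " + bondCount(kept));
        System.out.println("connected optimum bonds: " + bondCount(connectedOptimum));
    }
}
```

In this model the disconnected optimum has 7 bonds, which beats the 6 bonds of the connected optimum, but after dropping the smaller component only 4 bonds remain. That is why the result returned for d) is smaller than the connected MCES you expected.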


We are going to provide a better solution for finding connected MCES in an upcoming major release.


Best regards,
Peter

User 820e1cd6b2

21-09-2012 06:24:28

Hi Peter,


Thank you for the clarification and the fast response! Do you already know approximately when the next major release will be?


Best regards

ChemAxon 4a2fc68cd1

21-09-2012 16:14:26

Hi,


Our next major release, 5.11, will be released soon (most likely next week), but it won't contain improvements related to MCES. Such improvements are expected to be released later, but the actual schedule has not been determined yet. (We usually release a major version approximately every three months.)


Best regards,
Peter

ChemAxon 4a2fc68cd1

03-06-2013 11:23:52

Hi,


I am pleased to report that JChem 6.0 introduces a brand new MCS search engine. It is based on a much more efficient algorithm, which incorporates many heuristics for improving both the running time and the results. The improvements are especially substantial for the connected MCS search that you investigated. (Note: the "connected mode" option replaces the suboptimal setKeepLargestComponent() option of the old MCES class.)


The new MCS algorithm gives optimal results for all of your examples.


Best regards,
Peter