I found some weird behavior in the MCS search that I don't understand. Here is my code:
Molecule mol1 = MolImporter.importMol("Nc1ccc(N)c(c1)[N+]([O-])=O");
Molecule mol2 = MolImporter.importMol("Oc1c(Cl)cc(Cl)cc1Cl");
MCS mcs = new MCS();
The result of the MCS computation is c:c:c:c:c. In my opinion the result should be c1ccccc1.
Here is another example of two molecules that produce the same result (c:c:c:c:c), which in my opinion should be c1ccccc1:
Is this a bug or is there something wrong in my understanding of the procedure?
Your observation is correct: the substructure found is not an MCS. This is, however, more a feature of the MCS search than a bug, as it uses a heuristic algorithm that does not guarantee finding the optimal solution.
We have recently implemented a new MCES/MOS search, which will be released in 5.4. It finds the optimal solution in your particular case (and in most cases), although it is also a heuristic search, just a more robust one. A beta release will soon be available for download; I'll let you know here in the forum.
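To make the gap concrete, here is a minimal brute-force sketch of what an exact search would report for the two molecules in the question. It is plain Java and does not call JChem; the class `ExactMcsSketch` and its methods are hypothetical names, and this is not the algorithm used by the MCS search discussed here. It assumes molecules are reduced to heavy-atom graphs, matches atoms by element symbol only (charges, bond orders and aromaticity are ignored), and looks for the largest connected common induced subgraph by trying every mapping.

```java
import java.util.Arrays;

/** Toy exhaustive search for the largest connected common substructure of two small graphs. */
class ExactMcsSketch {
    private final String[] labelsA, labelsB;   // element symbols
    private final boolean[][] adjA, adjB;      // adjacency matrices
    private final int[] map;                   // map[i] = atom of B matched to atom i of A, or -1
    private final boolean[] usedB;             // atoms of B already matched
    private int best = 0;

    ExactMcsSketch(String[] labelsA, int[][] bondsA, String[] labelsB, int[][] bondsB) {
        this.labelsA = labelsA;
        this.labelsB = labelsB;
        this.adjA = adjacency(labelsA.length, bondsA);
        this.adjB = adjacency(labelsB.length, bondsB);
        this.map = new int[labelsA.length];
        Arrays.fill(map, -1);
        this.usedB = new boolean[labelsB.length];
    }

    static boolean[][] adjacency(int n, int[][] bonds) {
        boolean[][] adj = new boolean[n][n];
        for (int[] b : bonds) { adj[b[0]][b[1]] = true; adj[b[1]][b[0]] = true; }
        return adj;
    }

    /** Returns the atom count of the largest connected common substructure. */
    int search() {
        extend(0);
        return best;
    }

    // Try every way of adding one more matched atom pair to the current mapping.
    private void extend(int size) {
        best = Math.max(best, size);
        for (int i = 0; i < labelsA.length; i++) {
            if (map[i] != -1) continue;
            for (int j = 0; j < labelsB.length; j++) {
                if (usedB[j] || !labelsA[i].equals(labelsB[j]) || !compatible(i, j, size)) continue;
                map[i] = j; usedB[j] = true;
                extend(size + 1);
                map[i] = -1; usedB[j] = false;
            }
        }
    }

    // A pair may join if its bonds to all mapped atoms agree in both graphs and,
    // unless it is the first pair, it is bonded to at least one mapped atom.
    private boolean compatible(int i, int j, int size) {
        boolean touches = (size == 0);
        for (int k = 0; k < map.length; k++) {
            if (map[k] == -1) continue;
            if (adjA[i][k] != adjB[j][map[k]]) return false;
            touches |= adjA[i][k];
        }
        return touches;
    }

    static int demo() {
        // Heavy atoms of Nc1ccc(N)c(c1)[N+]([O-])=O, numbered in SMILES order.
        String[] a = {"N", "C", "C", "C", "C", "N", "C", "C", "N", "O", "O"};
        int[][] bondsA = {{0,1},{1,2},{2,3},{3,4},{4,5},{4,6},{6,7},{7,1},{6,8},{8,9},{8,10}};
        // Heavy atoms of Oc1c(Cl)cc(Cl)cc1Cl, numbered in SMILES order.
        String[] b = {"O", "C", "C", "Cl", "C", "C", "Cl", "C", "C", "Cl"};
        int[][] bondsB = {{0,1},{1,2},{2,3},{2,4},{4,5},{5,6},{5,7},{7,8},{8,1},{8,9}};
        return new ExactMcsSketch(a, bondsA, b, bondsB).search();
    }

    public static void main(String[] args) {
        System.out.println(demo()); // prints 6: the six ring carbons
    }
}
```

Under these assumptions the exhaustive search finds all six ring carbons, one atom more than the c:c:c:c:c result. The cost of trying every mapping grows very quickly with molecule size, which is the usual reason for preferring a heuristic.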
Apologies for any inconvenience.
I think an option for generating all maximal common subgraphs is as important as one for generating the maximum common subgraph. It shouldn't be too hard to imagine where the former can be useful. Here is one example.
Thanks for the wise thoughts. Such an option will be introduced in the next minor release (we also need it for internal purposes). The fragment-based MCS approach is new to us; we'll look at the recommended page.
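Separately from that example, the distinction itself can be shown with a toy sketch in plain Java. Nothing here is a JChem call: the class `MaximalCommonSketch` and the two six-atom chains are made up for illustration. It assumes atoms match by element symbol only, records every atom mapping that cannot be extended by another matching atom, and then drops atom sets contained in a larger recorded set. What remains are the maximal common connected substructures; the maximum one is simply the largest of them.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

/** Toy enumeration of all maximal common connected substructures of two small graphs. */
class MaximalCommonSketch {
    private final String[] labelsA, labelsB;   // element symbols
    private final boolean[][] adjA, adjB;      // adjacency matrices
    private final int[] map;                   // map[i] = atom of B matched to atom i of A, or -1
    private final boolean[] usedB;
    private final Set<Set<Integer>> found = new HashSet<>();

    MaximalCommonSketch(String[] labelsA, int[][] bondsA, String[] labelsB, int[][] bondsB) {
        this.labelsA = labelsA;
        this.labelsB = labelsB;
        this.adjA = adjacency(labelsA.length, bondsA);
        this.adjB = adjacency(labelsB.length, bondsB);
        this.map = new int[labelsA.length];
        Arrays.fill(map, -1);
        this.usedB = new boolean[labelsB.length];
    }

    static boolean[][] adjacency(int n, int[][] bonds) {
        boolean[][] adj = new boolean[n][n];
        for (int[] b : bonds) { adj[b[0]][b[1]] = true; adj[b[1]][b[0]] = true; }
        return adj;
    }

    /** Returns the atom sets (indices in the first graph) of all maximal common substructures. */
    Set<Set<Integer>> search() {
        extend(0);
        // Keep only sets that are not strictly contained in a larger recorded set.
        List<Set<Integer>> all = new ArrayList<>(found);
        found.removeIf(s -> all.stream().anyMatch(t -> t.size() > s.size() && t.containsAll(s)));
        return found;
    }

    private void extend(int size) {
        boolean extended = false;
        for (int i = 0; i < labelsA.length; i++) {
            if (map[i] != -1) continue;
            for (int j = 0; j < labelsB.length; j++) {
                if (usedB[j] || !labelsA[i].equals(labelsB[j]) || !compatible(i, j, size)) continue;
                extended = true;
                map[i] = j; usedB[j] = true;
                extend(size + 1);
                map[i] = -1; usedB[j] = false;
            }
        }
        if (!extended && size > 0) {           // nothing more can be added: record this mapping
            Set<Integer> atoms = new TreeSet<>();
            for (int i = 0; i < map.length; i++) if (map[i] != -1) atoms.add(i);
            found.add(atoms);
        }
    }

    // Bonds to all mapped atoms must agree, and the new pair must touch the mapping.
    private boolean compatible(int i, int j, int size) {
        boolean touches = (size == 0);
        for (int k = 0; k < map.length; k++) {
            if (map[k] == -1) continue;
            if (adjA[i][k] != adjB[j][map[k]]) return false;
            touches |= adjA[i][k];
        }
        return touches;
    }

    static Set<Set<Integer>> demo() {
        // Two made-up chains that differ only in the fourth atom: C-C-C-N-S-O and C-C-C-P-S-O.
        int[][] chain = {{0,1},{1,2},{2,3},{3,4},{4,5}};
        String[] a = {"C", "C", "C", "N", "S", "O"};
        String[] b = {"C", "C", "C", "P", "S", "O"};
        return new MaximalCommonSketch(a, chain, b, chain).search();
    }

    public static void main(String[] args) {
        System.out.println(demo()); // two sets: atoms 0,1,2 (C-C-C) and atoms 4,5 (S-O)
    }
}
```

A maximum-only search would return just the C-C-C fragment and say nothing about the shared S-O end, which is the kind of information an "all maximal" option would expose.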
Are you open to a comparative evaluation of our new MCS algorithm?
Sure. I've been waiting for an exact MCS in JChem for a long time (despite being told otherwise). I'll be happy to help in any way I can. I can even share the code with you if it will help.
The new MCES search has been released in JChem 5.4. It is not an exact search, as we do not really need one for our internal use cases. However, we are continuously improving the algorithm to better approximate the optimal solution. I admit there are cases where the exact MCS is needed, but finding an exact MCES is not our main goal right now; instead, we want to find a solution very close to the 'real one', and much faster than an exact search can.
If you could give it a try and send us feedback (of any kind), that would help us a lot in improving the new search.
If you run the mcs command in JChem, it starts the new search method. The simple command-line application has some simple yet useful new features: it can compare two sets of structures against each other, and it offers two ways to visualise the results, either in a grid view or in a table of pairs. Running the command without arguments lists all options and prints some usage examples.
Thank you for offering your source code, but I'm afraid we cannot make direct use of it. However, comparing your search, any other third-party search, and ours against each other would be very interesting for us.
Can we perhaps collaborate in this?