MoleculeGraph.fuse() bug with multicenter groups?

User 870ab5b546

01-05-2013 20:45:50

The code:


    Molecule getFusedMolecule() {
final String SELF = "MechStage.getFusedMolecule: ";
final Molecule fMol = new Molecule();
debugPrintMRV(SELF + "stage compounds before fusing:\n", molecules);
for (Molecule mol : molecules) fMol.fuse(mol);
debugPrintMRV(SELF + "stage compounds after fusing:\n", molecules);
debugPrintMRV(SELF + "fused molecule:\n", fMol);
return fMol;
} // getFusedMolecule()

The log output:


MechStage.getFusedMolecule: stage compounds before fusing:
[<?xml version="1.0"?>
<cml version="ChemAxon file format v5.10.0, generated by v5.11.5">
<MDocument>
<MChemicalStruct>
<molecule molID="m1">
<atomArray atomID="a1 a2 a3 a4 a5 a6 a7 a8" elementType="C C C C C X Fe Cl" sgroupRef="sg1 sg1 sg1 sg1 sg1 0 0 0" x2="-8.56624984741211 -9.812116176429136 -9.336253759041917 -7.796245935782302 -7.320383518395081 -8.56624984741211 -5.678750038146973 -4.138750038146973" y2="4.293742359681733 3.3885257611697317 1.9239783212498387 1.9239783212498387 3.3885257611697317 2.983750104904175 3.128124952316284 3.128124952316284"/>
<bondArray>
<bond atomRefs2="a1 a2" order="A"/>
<bond atomRefs2="a1 a5" order="A"/>
<bond atomRefs2="a2 a3" order="A"/>
<bond atomRefs2="a3 a4" order="A"/>
<bond atomRefs2="a4 a5" order="A"/>
<bond atomRefs2="a6 a7" convention="cxn:coord"/>
<bond atomRefs2="a7 a8" order="1"/>
</bondArray>
<molecule id="sg1" role="MulticenterSgroup" molID="m2" atomRefs="a1 a2 a3 a4 a5" center="a6"/>
</molecule>
</MChemicalStruct>
</MDocument>
</cml>
.<?xml version="1.0"?>
<cml version="ChemAxon file format v5.10.0, generated by v5.11.5">
<MDocument>
<MChemicalStruct>
<molecule molID="m1">
<atomArray atomID="a1 a2 a3 a4 a5" elementType="C C C C C" formalCharge="0 0 -1 0 0" x2="-1.6702696059563622 -2.867095630446218 -4.158577570022041 -3.759994215755578 -2.2221877632430775" y2="0.4353009718913631 1.4044281395550353 0.5657275709329622 -0.9218057582407091 -1.0024007870479663"/>
<bondArray>
<bond atomRefs2="a1 a2" order="2"/>
<bond atomRefs2="a1 a5" order="1"/>
<bond atomRefs2="a2 a3" order="1"/>
<bond atomRefs2="a3 a4" order="1"/>
<bond atomRefs2="a4 a5" order="2"/>
</bondArray>
</molecule>
</MChemicalStruct>
</MDocument>
</cml>
]
MechStage.getFusedMolecule: stage compounds after fusing:
[<?xml version="1.0"?>
<cml version="ChemAxon file format v5.10.0, generated by v5.11.5">
<MDocument>
<MChemicalStruct>
<molecule molID="m1">
<atomArray atomID="a1 a2 a3 a4 a5 a6 a7 a8" elementType="C C C C C X Fe Cl" x2="-8.56624984741211 -9.812116176429136 -9.336253759041917 -7.796245935782302 -7.320383518395081 -8.56624984741211 -5.678750038146973 -4.138750038146973" y2="4.293742359681733 3.3885257611697317 1.9239783212498387 1.9239783212498387 3.3885257611697317 2.983750104904175 3.128124952316284 3.128124952316284"/>
<bondArray>
<bond atomRefs2="a1 a2" order="A"/>
<bond atomRefs2="a1 a5" order="A"/>
<bond atomRefs2="a2 a3" order="A"/>
<bond atomRefs2="a3 a4" order="A"/>
<bond atomRefs2="a4 a5" order="A"/>
<bond atomRefs2="a6 a7" convention="cxn:coord"/>
<bond atomRefs2="a7 a8" order="1"/>
</bondArray>
</molecule>
</MChemicalStruct>
</MDocument>
</cml>
.<?xml version="1.0"?>
<cml version="ChemAxon file format v5.10.0, generated by v5.11.5">
<MDocument>
<MChemicalStruct>
<molecule molID="m1">
<atomArray atomID="a1 a2 a3 a4 a5" elementType="C C C C C" formalCharge="0 0 -1 0 0" x2="-1.6702696059563622 -2.867095630446218 -4.158577570022041 -3.759994215755578 -2.2221877632430775" y2="0.4353009718913631 1.4044281395550353 0.5657275709329622 -0.9218057582407091 -1.0024007870479663"/>
<bondArray>
<bond atomRefs2="a1 a2" order="2"/>
<bond atomRefs2="a1 a5" order="1"/>
<bond atomRefs2="a2 a3" order="1"/>
<bond atomRefs2="a3 a4" order="1"/>
<bond atomRefs2="a4 a5" order="2"/>
</bondArray>
</molecule>
</MChemicalStruct>
</MDocument>
</cml>
]
MechStage.getFusedMolecule: fused molecule:
<?xml version="1.0"?>
<cml version="ChemAxon file format v5.10.0, generated by v5.11.5">
<MDocument>
<MChemicalStruct>
<molecule molID="m1">
<atomArray atomID="a1 a2 a3 a4 a5 a6 a7 a8 a9 a10 a11 a12 a13" elementType="C C C C C X Fe Cl C C C C C" formalCharge="0 0 0 0 0 0 0 0 0 0 -1 0 0" sgroupRef="sg1 sg1 sg1 sg1 sg1 0 0 0 0 0 0 0 0" x2="-8.56624984741211 -9.812116176429136 -9.336253759041917 -7.796245935782302 -7.320383518395081 -8.56624984741211 -5.678750038146973 -4.138750038146973 -1.6702696059563622 -2.867095630446218 -4.158577570022041 -3.759994215755578 -2.2221877632430775" y2="4.293742359681733 3.3885257611697317 1.9239783212498387 1.9239783212498387 3.3885257611697317 2.983750104904175 3.128124952316284 3.128124952316284 0.4353009718913631 1.4044281395550353 0.5657275709329622 -0.9218057582407091 -1.0024007870479663"/>
<bondArray>
<bond atomRefs2="a1 a2" order="A"/>
<bond atomRefs2="a1 a5" order="A"/>
<bond atomRefs2="a2 a3" order="A"/>
<bond atomRefs2="a3 a4" order="A"/>
<bond atomRefs2="a4 a5" order="A"/>
<bond atomRefs2="a6 a7" convention="cxn:coord"/>
<bond atomRefs2="a7 a8" order="1"/>
<bond atomRefs2="a9 a10" order="2"/>
<bond atomRefs2="a9 a13" order="1"/>
<bond atomRefs2="a10 a11" order="1"/>
<bond atomRefs2="a11 a12" order="1"/>
<bond atomRefs2="a12 a13" order="2"/>
</bondArray>
<molecule id="sg1" role="MulticenterSgroup" molID="m2" atomRefs="a1 a2 a3 a4 a5" center="a6"/>
</molecule>
</MChemicalStruct>
</MDocument>
</cml>

Note that the MulticenterSgroup definition has *disappeared* from the first stage compound after it has been fused into the fused molecule.  I'm pretty sure this is a bug, as it leads to inconsistencies later (null Sgroups).  I didn't have any problem with getFusedMolecule() before I started playing around with multiatom groups.

ChemAxon e49cf225c6

02-05-2013 13:34:35

The same thing happens with other kinds of sgroups too. However I don't think it is a bug.


The fuse() method creates inconsistency for regular MoleculeGraphs anyway. The atoms and bonds of the fused-in structure will be contained in 2 molecules after the fuse. (In that regard I think we should remove even the atoms and bonds too).


If one would like to preserve the original molecules, they should be cloned before fusing.


Maybe we should make it clear in the documentation.

User 870ab5b546

02-05-2013 13:46:53

I understand that the MolAtoms and MolBonds are in two different graphs.  But if they are in two different graphs, the Sgroups must also be in two different graphs.  


Please do not change fuse() so that the fused molecule contains copies of the original molecule's atoms and bonds. We use fuse() to combine a Molecule[] and some associated MEFlows into a single MDocument.  We need the original MolAtoms of the Molecule[] to be placed in the fused Molecule so that the MEFlow references remain consistent.  In other words, fuse() should be the opposite of convertToFrags().


Perhaps the best way to handle it is to have fuse() put the original objects into the fused Molecule and replace the objects in the original Molecules with copies.  But you would definitely have to document that approach in the API.

ChemAxon e49cf225c6

03-05-2013 09:05:57

After some discussion we decided not to encourage the use of the original molecules (which were fused in). We still think that the best solution would be to remove everything from the original structures.


fuse()  should care only about, that the fused mocule is structurally consistent. This seems to be working so far, and won't be changed.


To retain the original molecules, they shall be cloned before being fused.


 


You're right, that "But if they are in two different graphs, the Sgroups must also be in two different graphs", but we would solve that by removing the bonds and atoms too.

User 870ab5b546

03-05-2013 12:19:00










dzatonyi wrote:

After some discussion we decided not to encourage the use of the original molecules (which were fused in). We still think that the best solution would be to remove everything from the original structures.


fuse()  should care only about, that the fused mocule is structurally consistent. This seems to be working so far, and won't be changed.


To retain the original molecules, they shall be cloned before being fused.


You're right, that "But if they are in two different graphs, the Sgroups must also be in two different graphs", but we would solve that by removing the bonds and atoms too.



I can live with you removing everything from the original Molecules.  That behavior would also correspond to what happens in convertToFrags().  


When you say, "To retain the original molecules, they shall be cloned before being fused," I assume you mean that the programmer should clone them before fusing them, and not that JChem will clone them before fusing them?

ChemAxon e49cf225c6

03-05-2013 12:34:23










bobgr wrote:










dzatonyi wrote:

After some discussion we decided not to encourage the use of the original molecules (which were fused in). We still think that the best solution would be to remove everything from the original structures.


fuse()  should care only about, that the fused mocule is structurally consistent. This seems to be working so far, and won't be changed.


To retain the original molecules, they shall be cloned before being fused.


You're right, that "But if they are in two different graphs, the Sgroups must also be in two different graphs", but we would solve that by removing the bonds and atoms too.



I can live with you removing everything from the original Molecules.  That behavior would also correspond to what happens in convertToFrags().


When you say, "To retain the original molecules, they shall be cloned before being fused," I assume you mean that the programmer should clone them before fusing them, and not that JChem will clone them before fusing them?



Yes, the programmer should care about this.