order of operations

User 870ab5b546

02-05-2013 02:18:30

I am normalizing all coordinate bonds A -> B to A+–B before comparing two structures.  


target: P[Pd](P)P |C:0.0,2.1,3.2|
query: [H]P([H])([H])[Pd](P([H])([H])[H])P([H])([H])[H] |C:1.3,5.4,9.8|

The MolSearch is FULL, with charge, radical, and isotope matching set to exact and stereo search type set to STEREO_SPECIFIC.  


When my normalization method is the following, JChem returns false for search.isMatching().


    public static void normalizeCoordinateBondsNoClone(Molecule mol) {
final String SELF = "Normalize.normalizeCoordinateBondsNoClone: ";
for (MolBond bond : mol.getBondArray()) {
if (bond.getType() == MolBond.COORDINATE) {
final MolAtom atom1 = bond.getAtom1();
if (!ChemUtils.isMulticenterAtom(atom1)) {
final MolAtom atom2 = bond.getAtom2();
bond.setType(1);
atom1.setCharge(atom1.getCharge() + 1);
atom2.setCharge(atom2.getCharge() - 1);
} // if bond involves multiatom group
} // if bond is coordinate
} // for each bond
} // normalizeCoordinateBondsNoClone(Molecule)

Here are the MRVs of the Molecules after bond normalization in this case:


MolCompare.matchExact: target:
<?xml version="1.0"?>
<cml version="ChemAxon file format v5.10.0, generated by v5.11.5">
<MDocument>
<MChemicalStruct>
<molecule molID="m1">
<atomArray atomID="a1 a2 a3 a4" elementType="P Pd P P" formalCharge="1 -3 1 1" x2="2.3099999999999996 1.54 2.3100000000000005 0.0" y2="1.3336791218280357 0.0 -1.3336791218280353 0.0"/>
<bondArray>
<bond atomRefs2="a1 a2" order="1"/>
<bond atomRefs2="a3 a2" order="1"/>
<bond atomRefs2="a4 a2" order="1"/>
</bondArray>
</molecule>
</MChemicalStruct>
</MDocument>
</cml>

query:
<?xml version="1.0"?>
<cml version="ChemAxon file format v5.10.0, generated by v5.11.5">
<MDocument>
<MChemicalStruct>
<molecule molID="m1">
<atomArray atomID="a1 a2 a3 a4 a5 a6 a7 a8 a9 a10 a11 a12 a13" elementType="H P H H Pd P H H H P H H H" formalCharge="0 1 0 0 -3 1 0 0 0 1 0 0 0" x2="3.6436791218280353 2.3099999999999996 3.079999999999999 0.9763208781719639 1.54 2.3100000000000005 0.9763208781719652 3.080000000000001 3.643679121828036 0.0 9.42978035343462E-17 -1.54 -2.828934106030386E-16" y2="0.563679121828036 1.3336791218280357 2.6673582436560714 2.1036791218280357 0.0 -1.3336791218280353 -2.1036791218280357 -2.6673582436560705 -0.563679121828035 0.0 1.54 1.885956070686924E-16 -1.54"/>
<bondArray>
<bond atomRefs2="a1 a2" order="1"/>
<bond atomRefs2="a2 a3" order="1"/>
<bond atomRefs2="a2 a4" order="1"/>
<bond atomRefs2="a2 a5" order="1"/>
<bond atomRefs2="a6 a5" order="1"/>
<bond atomRefs2="a6 a7" order="1"/>
<bond atomRefs2="a6 a8" order="1"/>
<bond atomRefs2="a6 a9" order="1"/>
<bond atomRefs2="a10 a5" order="1"/>
<bond atomRefs2="a10 a11" order="1"/>
<bond atomRefs2="a10 a12" order="1"/>
<bond atomRefs2="a10 a13" order="1"/>
</bondArray>
</molecule>
</MChemicalStruct>
</MDocument>
</cml>

However, if I move the command bond.setType(1) to after setting the atom charges, JChem returns true.  Here are the MRVs of the Molecules after bond normalization in this case:


MolCompare.matchExact: target:
<?xml version="1.0"?>
<cml version="ChemAxon file format v5.10.0, generated by v5.11.5">
<MDocument>
<MChemicalStruct>
<molecule molID="m1">
<atomArray atomID="a1 a2 a3 a4" elementType="P Pd P P" formalCharge="1 -3 1 1" x2="2.3099999999999996 1.54 2.3100000000000005 0.0" y2="1.3336791218280357 0.0 -1.3336791218280353 0.0"/>
<bondArray>
<bond atomRefs2="a1 a2" order="1"/>
<bond atomRefs2="a3 a2" order="1"/>
<bond atomRefs2="a4 a2" order="1"/>
</bondArray>
</molecule>
</MChemicalStruct>
</MDocument>
</cml>

query:
<?xml version="1.0"?>
<cml version="ChemAxon file format v5.10.0, generated by v5.11.5">
<MDocument>
<MChemicalStruct>
<molecule molID="m1">
<atomArray atomID="a1 a2 a3 a4 a5 a6 a7 a8 a9 a10 a11 a12 a13" elementType="H P H H Pd P H H H P H H H" formalCharge="0 1 0 0 -3 1 0 0 0 1 0 0 0" x2="3.6436791218280353 2.3099999999999996 3.079999999999999 0.9763208781719639 1.54 2.3100000000000005 0.9763208781719652 3.080000000000001 3.643679121828036 0.0 9.42978035343462E-17 -1.54 -2.828934106030386E-16" y2="0.563679121828036 1.3336791218280357 2.6673582436560714 2.1036791218280357 0.0 -1.3336791218280353 -2.1036791218280357 -2.6673582436560705 -0.563679121828035 0.0 1.54 1.885956070686924E-16 -1.54"/>
<bondArray>
<bond atomRefs2="a1 a2" order="1"/>
<bond atomRefs2="a2 a3" order="1"/>
<bond atomRefs2="a2 a4" order="1"/>
<bond atomRefs2="a2 a5" order="1"/>
<bond atomRefs2="a6 a5" order="1"/>
<bond atomRefs2="a6 a7" order="1"/>
<bond atomRefs2="a6 a8" order="1"/>
<bond atomRefs2="a6 a9" order="1"/>
<bond atomRefs2="a10 a5" order="1"/>
<bond atomRefs2="a10 a11" order="1"/>
<bond atomRefs2="a10 a12" order="1"/>
<bond atomRefs2="a10 a13" order="1"/>
</bondArray>
</molecule>
</MChemicalStruct>
</MDocument>
</cml>

I don't see any difference in the MRVs in the two cases, yet in one case they match, and the other, they don't.  Why would setting the bond type before or after setting the atom charges make a difference in whether the structures match???

ChemAxon 9991eff751

03-05-2013 09:38:30

Hello,


The normalization became sensitive to the order of the operations because MolBond.setType() triggers an implicit valenceCheck() on the Molecule, however Molecule.setCharge() doesn't.


The difference doesn't show up in the exported files, because the exporter also ensures that the Molecule has a correct valence before exporting - and this way it hides this error.


The core team is planning to make these methods more consistent by removing these implicit  valenceChecks() - the reason for this is that when a molecule is built up piece by piece, these valenceCheck()-s are unnecessary.


our molecule manipulation expert's recommendation: after finishing molecule alteration, a call to Molecule.valenceCheck() is recommended


best regards,


Zoltan

User 870ab5b546

03-05-2013 13:14:06

Wow!  That is subtle.  So if I change the bond from coordinate to single, the implicit H counts change, and then when I set the atom charges, the implicit H counts fail to change back to what they were originally.  


There are so many places in our code that we manipulate molecules that I am afraid that instituting Molecule.valenceCheck() after every manipulation will introduce new bugs.  Maybe the better solution is for JChem to run MolAtom.valenceCheck() on an atom after MolAtom.setCharge().

ChemAxon 25dcd765a3

06-05-2013 12:54:37










bobgr wrote:

Wow!  That is subtle.  So if I change the bond from coordinate to single, the implicit H counts change, and then when I set the atom charges, the implicit H counts fail to change back to what they were originally.  


There are so many places in our code that we manipulate molecules that I am afraid that instituting Molecule.valenceCheck() after every manipulation will introduce new bugs.  Maybe the better solution is for JChem to run MolAtom.valenceCheck() on an atom after MolAtom.setCharge().



There are functions which are running ValenceCheck during their operation. In case of setType function this is explicitly mentioned in the APIdoc. There are other functions which does not call ValenceCheck after their operation, in this case it is not mentioned in the APIdoc. To call valenceCheck after every operation in the molecule is time consuming. So the current idea is to manipulate the molecule and if the molecule is ready, call valenceCheck.


On the other hand we can provide high level functions to manipulate the molecule which would call valenceCheck after every operation, if it suits your need.


 


Let me know your opinion.

User 870ab5b546

06-05-2013 13:43:16

I guess I don't understand the rationale behind running valenceCheck() during MolBond.setType() but not during MolAtom.setCharge().  In both cases, the number of implicit H atoms is likely to change.  It seems to me that JChem should either run valenceCheck() automatically every time it does setType(, setCharge(), setRadical(), etc., or it should always leave it up to the user to run valenceCheck().  I think the former approach is more intuitive.  


An alternative is to clear the cached implicit H count when you do setType(), setCharge(), etc., and then recalculate it only if and when a function is called that requires it.  


At least, add to the API documentation for setCharge() and setRadical() that valenceCheck() is required afterwards for an accurate implicitHCount().

ChemAxon 25dcd765a3

06-05-2013 14:32:04

Maybe it is more intuitive, but slow in many cases. Actually it is not about specific methods, but we need methods which does not call valenceCheck to build structures quickly. To set a bond type it is possible during construction and even with some other method which does not call valenceCheck. But for other setters we have the problem that we cannot set them in the constructor (like setCharge, setRadical). So the reason is to build molecules quickly, without calling valenceCheck. 


An alternative is to clear the cached implicit H count when you do setType(), setCharge(), etc., and then recalculate it only if and when a function is called that requires it.   

Instead of this, we don't even clear the cached implicit H count.


We keep our current public functions as they are for backward compatibility reasons, but we are open to add new functions. Eg the current getImplicitHcount() function definitely states that ValenceCheck call is needed to update the information so we keep this method for backward compatibility.

User 870ab5b546

06-05-2013 14:43:19

OK.  Does removeBond() trigger valenceCheck() for the atoms that were connected by the bond being removed?  Does removeAtom() trigger valenceCheck() for the atoms that were connected to the atom being removed?  The API doesn't say.  If the API doesn't say, can I safely assume that I need to run valenceCheck()?

ChemAxon 25dcd765a3

06-05-2013 14:57:23










bobgr wrote:

OK.  Does removeBond() trigger valenceCheck() for the atoms that were connected by the bond being removed?  Does removeAtom() trigger valenceCheck() for the atoms that were connected to the atom being removed?  The API doesn't say. 



My bad. We will correct the APIDoc ASAP and thank you for pointing this out.


If the API doesn't say, can I safely assume that I need to run valenceCheck()? 

Yes.

User 870ab5b546

06-05-2013 15:11:49










volfi wrote:

An alternative is to clear the cached implicit H count when you do setType(), setCharge(), etc., and then recalculate it only if and when a function is called that requires it.   

Instead of this, we don't even clear the cached implicit H count.



Yeah, that's what causes the problem.  Instead, take this approach: Clear the cached implicit H count after doing an operation such as setCharge(), setRadical(), or setType().  Then later, if the code calls a method that needs the implicit H count, JChem will realize that it needs to recalculate and cache it.  Et voilà! no more need for the coder to know which methods call valenceCheck() and which don't, and getImplicitHCount() always returns an accurate value, regardless of what the user has done before.  I don't think clearing the cached implicit H count should be time-consuming, should it?

ChemAxon 25dcd765a3

08-05-2013 06:31:10

I think this is a good idea, we have to check how does this fits to our representation and do plenty of speedtest not to slow down at any level, but generally I agree with it. Thank you for your comment.

ChemAxon d26931946c

16-05-2013 16:34:23


Hi Bob, 


 


We started to work on this issue but we can't change the behavior of the getImplicitHCount() method of MolAtom because many part of our code depends on it's current behavior and probably this is the case for some of our users.


We will introduce a new method that calls valence check if there was any modification on the atom that made it necessary since the last run of it.


Is this solution acceptable for you? Can you suggest a name for this method?


 


BR,


Peter


User 870ab5b546

16-05-2013 16:38:42

I would just add a method with the same name but a new parameter, boolean runValenceCheckFirst, that is false by default (when omitted).  

ChemAxon 2cd598e7ad

13-06-2013 08:51:27

Hi,

From 6.0.1. a new method will be available: getImlicitHCount(boolean). This method performs valence checking if it is called with true parameter (and if valence related change has happened).

Regards,
Domi