Valence property fixer always reports false (no fix)

User 0261d34ad7

04-10-2012 09:09:06

Hi,


I think I've found a bug in the behaviour of the ValencePropertyChecker or RemoveValencePropertyFixer classes. Would you mind confirming that this really is a bug? If not, any insight into the expected behaviour would be appreciated.


We're using the StructureChecker class from Java, with the following XML configuration:



<checkers>


  <ValencePropertyChecker name="property"     fixMode="do_not_fix" fixerClassName="chemaxon.fixers.RemoveValencePropertyFixer" defaultValence="false" nonDefaultValence="true"/>


</checkers>


To invoke the fixer, we're using CheckerRunner to get a list of StructureCheckerResult objects, then calling fix(..) on the CheckerRunner with each result returned.


However, we're observing that the call to fix always returns false for this particular configuration, even when the molecule has been changed. This is a problem because our code uses the boolean response as part of structure processing. As it stands, we can't tell whether the molecule has changed, or whether the resulting molecule is good.


We've worked around this for now by ignoring the return value for this particular fixer, but it's not the preferred solution.


A couple of other things to mention:




It would also be greatly appreciated if you have any documentation on these structure checking components, and the correct way to use them to detect valence errors.


Any help appreciated,


Jim

ChemAxon f250711500

05-10-2012 13:44:48

Hi Jim,


Unfortunately, this is a bug. RemoveValencePropertyFixer does not work as intended; it will be fixed in the upcoming version of MarvinBeans (5.11.2). As the other questions need more investigation, I will answer them next week.


Best regards,


Imre

User 0261d34ad7

06-10-2012 06:39:02

Great, thanks for the information. Anything more you can provide on the behaviour of this component would be awesome.


Thanks,


Jim

ChemAxon f250711500

08-10-2012 07:36:10

Hi Jim,


1. Collecting the result of the fixers
Some structure fixers cannot report whether the root cause of the problem has been solved, as that would require another checking step.


For example, take a molecule that cannot be drawn in 2D without overlapping bonds. In this case the applied "clean" fixer modifies the structure, but it cannot fix the overlapping bond issue. The fixer cannot determine whether the structure is fixed; it only has information about its own operation.


Another problem can arise when a configuration contains more than one (checker, fixer) pair, as a fixer can "ruin" the result created by a previously applied fixer.


These cases can be handled using the AdvancedCheckerRunner class (the GUI and command-line interface of Structure Checker use this class too), which executes a configuration more than once if required to get the "most proper" result for a molecule.


Another useful feature of AdvancedCheckerRunner is logging. We can use logging to create reports or to collect data on the checker/fixer mechanism. (We implemented CSV, HTML, etc. report generation, but you can implement your own.) See the attached example code for AdvancedCheckerRunner and logging. A logger attached to AdvancedCheckerRunner can provide full information on which errors in the structure were fixed and which fixers were executed on it. The most useful parts for determining whether an error has been fixed are the onFirstCheck() and onLastCheck() methods of the StructureCheckerLogger class. The result of onFirstCheck() contains the errors found in the structure before the first execution of fix, and the result of onLastCheck() contains the errors still present after the last execution of fix.
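The first-check/last-check comparison described above can be illustrated with a small sketch. It deliberately uses plain Java collections instead of the real StructureCheckerLogger API, whose exact signatures are not shown in this thread; the class FixReport, its methods, and the error labels are hypothetical stand-ins.

```java
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.Set;

// Hypothetical helper: models errors as plain strings, not ChemAxon result objects.
class FixReport {

    /** Errors that were present at the first check but are gone at the last check. */
    static Set<String> fixedErrors(Set<String> firstCheck, Set<String> lastCheck) {
        Set<String> fixed = new LinkedHashSet<>(firstCheck);
        fixed.removeAll(lastCheck);
        return fixed;
    }

    /** A structure is treated as clean only when the last check found no errors. */
    static boolean isClean(Set<String> lastCheck) {
        return lastCheck.isEmpty();
    }

    public static void main(String[] args) {
        // Assumed example data: what a logger might collect before and after fixing.
        Set<String> first = new LinkedHashSet<>(
                Arrays.asList("valence property", "overlapping bonds"));
        Set<String> last = new LinkedHashSet<>(
                Arrays.asList("overlapping bonds"));

        System.out.println("Fixed:     " + fixedErrors(first, last));
        System.out.println("Remaining: " + last);
        System.out.println("Clean:     " + isClean(last));
    }
}
```

Deciding pass/fail from the remaining errors, rather than from a fixer's boolean return value, avoids relying on a fixer knowing whether the underlying problem was solved.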


2. Fixmode handling
If you use the BasicCheckerRunner class, the value of fixMode (ask, fix, do_not_fix) is not used; only AdvancedCheckerRunner has this feature. With AdvancedCheckerRunner, the checkAndWait() and check() methods will not call any fixers, while the fix() method applies fixes only for checkers configured with fixMode=fix (the default is fixMode=ask). With this switch you can determine, even when calling the fix() method, which types of errors should be corrected.
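To make the fixMode rule concrete, here is a minimal sketch of the selection logic as described above: only findings whose checker is configured with fixMode=fix are passed on to a fixer. It does not use the real AdvancedCheckerRunner; the FixModeFilter class, the Finding type, and the FixMode enum are hypothetical names introduced for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical model of fixMode handling; not ChemAxon code.
class FixModeFilter {

    enum FixMode { ASK, FIX, DO_NOT_FIX }

    /** Stand-in for one checker's finding together with its configured fixMode. */
    static final class Finding {
        final String checker;
        final FixMode mode;

        Finding(String checker, FixMode mode) {
            this.checker = checker;
            this.mode = mode;
        }
    }

    /** Returns the checkers whose findings would be fixed automatically. */
    static List<String> checkersToFix(List<Finding> findings) {
        List<String> selected = new ArrayList<>();
        for (Finding f : findings) {
            if (f.mode == FixMode.FIX) { // ASK and DO_NOT_FIX are skipped
                selected.add(f.checker);
            }
        }
        return selected;
    }

    public static void main(String[] args) {
        List<Finding> findings = Arrays.asList(
                new Finding("AliasChecker", FixMode.FIX),
                new Finding("ValencePropertyChecker", FixMode.DO_NOT_FIX),
                new Finding("MoleculeChargeChecker", FixMode.ASK));
        System.out.println(checkersToFix(findings));
    }
}
```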


3. ValenceErrorChecker
The investigation of the combined ValenceErrorChecker and ValencePropertyChecker behavior needs a bit more time. I will answer when I have found the root cause.


Best Regards,
Imre

User 0261d34ad7

08-10-2012 08:15:17

That's great, thank you. The AdvancedCheckerRunner sounds perfect.

User 0261d34ad7

08-10-2012 14:45:57

I had a quick follow up question if you have a minute...


At the moment, we assume that a return value of "false" from a call to fix() means the chemical is bad, which as you've said may not be a valid assumption.


If it's not too much trouble, could you say whether any of the following configurations would hit the issue you've described? If so, it means we may be discarding good chemicals :(


  <AliasChecker fixMode="fix" fixerClassName="chemaxon.fixers.ConvertToAtomFixer"/>
  <AbbreviatedGroupChecker contracted="true" expanded="true" fixMode="fix" fixerClassName="chemaxon.fixers.UngroupFixer"/>
  <AromaticityErrorChecker fixMode="fix" fixerClassName="chemaxon.fixers.DearomatizeFixer" type="general"/>
  <CoordinationSystemErrorChecker fixMode="fix" fixerClassName="chemaxon.fixers.RemoveBondFixer"/>
  <CovalentCounterionChecker fixMode="fix" fixerClassName="chemaxon.fixers.CovalentCounterionFixer"/>
  <MoleculeChargeChecker fixMode="fix" fixerClassName="chemaxon.fixers.NeutralizeChargeFixer"/>


Just so you know, we create our CheckerRunner like this:


    ConfigurationReader cr = new XMLBasedConfigurationReader(checkerInpStream);        
    CheckerRunner crun = new BasicCheckerRunner(cr);

And we pass/fail chemicals like this:


List<StructureCheckerResult> results = mCRunner.checkAndWait();

 

for (StructureCheckerResult res : results) {
   boolean fixed = mCRunner.fix(res);
   if (fixed) {
      // Keep it, record fix...
   } else {
      // Throw it away
   }
}

 


 

ChemAxon f250711500

09-10-2012 08:10:43

Hi Jim,


At first glance, I suggest switching the order of the first two checkers. The Ungroup fixer should be applied first, because fixers cannot fix inside a contracted S-group.
Reason: checkers find the error inside an S-group, but we should not make automatic modifications, as that could ruin the meaning of the S-group. (To let the user control which groups can be fixed and which cannot, an expanded S-group means a fixer may modify inside the group, while a contracted S-group means it may not.)


In this case, if you have an alias inside a contracted S-group, it will be discarded. If you switch the AbbreviatedGroupChecker and the AliasChecker, the groups will be ungrouped first, and then the aliases contained in those groups will be fixed.


The second issue is related to the CovalentCounterionFixer. It adds charge to the molecule, which can be removed afterwards by the NeutralizeChargeFixer if the total charge of the molecule is not 0.
I would recommend switching the order of the NeutralizeChargeFixer and the CovalentCounterionFixer, as the order can change the result considerably when an atom modified by the covalent counterion fixer is connected to a charged atom (e.g. C[NH2+]O[Na]). It is up to you to decide which result you expect. In both cases the molecule will be accepted, but the resulting structures will differ.


 


Best regards,
Imre

User 0261d34ad7

09-10-2012 08:49:52

Brilliant, thank you very much for the feedback.


Jim

User 0261d34ad7

15-10-2012 14:39:07

Hi,


Following up on the original issue, we've created a SMARTS / substructure based configuration as a way of capturing all valence issues. We were wondering if it was possible to get a little feedback? I can send a copy via email if necessary.


From testing, our SMARTS based config captures all the valence issues we've identified, but certain types of bad valence issue are still slipping through. For example:



CC(C)[NH]1(CCN=C1)(OC(=O)C(=O)O[NH]1(CCN=C1)(C(C)C)C(C)C)C(C)C



Our chemical specialist identified the above example, and is investigating as we speak, but we thought a forum question wouldn't hurt. Specifically, it would be good to know if this kind of issue should really be captured by one of the built in standardizers / fixers...


Jim

ChemAxon afdac7b783

27-10-2012 07:48:25


Hi Jim,


Sorry for the late answer.


Considering the attached molecule and the behavior of the valence property and valence error checkers on it:


 


Valence Error Checker "accepts" manually set valence properties; it will not mark atoms with manually set but erroneous valence properties.


If you first remove the valence property using the valence property checker with the remove valence property fixer, the valence error checker applied afterwards will catch the six-valent nitrogen in the attached structure.


- The default valences of nitrogen are 3 and 5. Atoms with valences other than "normal" are described in brackets. -
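The interaction described above can be sketched as a toy model. This is not the ChemAxon implementation: the ValenceSketch class and its method are hypothetical, and the only facts taken from this thread are that nitrogen's default valences are 3 and 5 and that a manually set valence property suppresses the error.

```java
import java.util.Arrays;
import java.util.List;

// Toy model of the described checker behaviour for a nitrogen atom; not ChemAxon code.
class ValenceSketch {

    // Default valences of nitrogen as stated in this thread.
    static final List<Integer> NITROGEN_DEFAULT_VALENCES = Arrays.asList(3, 5);

    /**
     * Returns true if the atom would be reported as a valence error.
     * A non-null valenceProperty models a manually set valence, which is
     * accepted as-is and therefore hides the error.
     */
    static boolean reportsValenceError(int actualValence, Integer valenceProperty) {
        if (valenceProperty != null) {
            return false; // manually set valence is "accepted"
        }
        return !NITROGEN_DEFAULT_VALENCES.contains(actualValence);
    }

    public static void main(String[] args) {
        // Six-valent nitrogen with a manually set valence property: not reported.
        System.out.println(reportsValenceError(6, 6));
        // Same atom after the valence property has been removed: reported.
        System.out.println(reportsValenceError(6, null));
    }
}
```

In this model, removing the valence property before running the error check is what exposes the six-valent nitrogen, which mirrors the checker order suggested above.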


 


Best regards,


 


Viktoria