question about MolSearch behavior

User 870ab5b546

02-01-2009 15:43:03

The MolSearch invocation:


Code:
MolSearch search = new MolSearch();
MolSearchOptions searchOpts = new MolSearchOptions();
searchOpts.setSearchType(SearchConstants.EXACT);
searchOpts.setExactBondMatching(false);
searchOpts.setChargeMatching(SearchConstants.CHARGE_MATCHING_IGNORE);
searchOpts.setRadicalMatching(SearchConstants.RADICAL_MATCHING_IGNORE);
searchOpts.setIsotopeMatching(SearchConstants.ISOTOPE_MATCHING_EXACT);
searchOpts.setStereoSearchType(SearchConstants.STEREO_SPECIFIC);
searchOpts.setStereoModel(SearchConstants.STEREO_MODEL_GLOBAL);
search.setSearchOptions(searchOpts);
search.setTarget(authMol);
search.setQuery(respMol);
boolean retValue = search.isMatching();

The query:


Code:
<?xml version="1.0" ?>
<cml>
<MDocument>
  <MChemicalStruct>
    <molecule molID="m1">
      <atomArray
          atomID="a1 a2 a3 a4 a5 a6 a7"
          elementType="C C O C C C C"
          mrvPseudo="0 0 0 PSEUDO_H PSEUDO_H PSEUDO_H PSEUDO_H"
          x2="-3.3687500953674316 -2.598750095367431 -1.0587500953674311 -3.3687500953674308 -2.035070973539396 -4.138750095367432 -4.702429217195467"
          y2="1.9731249809265137 0.6394458590984782 0.6394458590984782 -0.6942332627295575 2.743124980926514 3.3068041027545494 1.2031249809265134"
          />
      <bondArray>
        <bond atomRefs2="a1 a2" order="1" />
        <bond atomRefs2="a2 a3" order="1" />
        <bond atomRefs2="a2 a4" order="1" />
        <bond atomRefs2="a1 a5" order="1" />
        <bond atomRefs2="a1 a6" order="1" />
        <bond atomRefs2="a1 a7" order="1" />
      </bondArray>
    </molecule>
  </MChemicalStruct>
</MDocument>
</cml>

The target:


Code:
<?xml version="1.0" ?>
<cml>
<MDocument>
  <MChemicalStruct>
    <molecule molID="m1">
      <atomArray
          atomID="a1 a2 a3 a4 a5 a6 a7"
          elementType="C C O C C C C"
          formalCharge="0 1 -1 0 0 0 0"
          mrvPseudo="0 0 0 PSEUDO_H PSEUDO_H PSEUDO_H PSEUDO_H"
          x2="-2.983750104904175 -2.2137501049041743 -0.6737501049041743 -2.983750104904174 -1.6500709830761393 -3.7537501049041753 -4.3174292267322105"
          y2="1.6362500190734863 0.30257089724545083 0.30257089724545083 -1.0311082245825849 2.406250019073487 2.969929140901522 0.8662500190734861"
          />
      <bondArray>
        <bond atomRefs2="a1 a2" order="1" />
        <bond atomRefs2="a2 a3" order="1" />
        <bond atomRefs2="a2 a4" order="1" />
        <bond atomRefs2="a1 a5" order="1" />
        <bond atomRefs2="a1 a6" order="1" />
        <bond atomRefs2="a1 a7" order="1" />
      </bondArray>
    </molecule>
  </MChemicalStruct>
</MDocument>
</cml>

The search returns true. This is actually the behavior I want, but I didn't expect it. What is curious is that the algorithm returns true even though the query has two implicit H atoms that are not present in the target. When the charge and radical flags are set to "ignore", are implicit H atoms ignored as well? If so, you should say so in the API and the Query Guide.

ChemAxon 990acf0dec

02-01-2009 19:24:39

Hi Bob,

I've moved this topic to the appropriate forum.

Best regards,
Akos

ChemAxon 42004978e8

06-01-2009 11:02:32

Hi Bob,

Implicit H matching doesn't depend directly on charge matching.
The number of implicit hydrogens doesn't need to be equal in an exact search. Exact search checks the matching of heavy atoms.
You can see examples here:
http://www.chemaxon.com/jchem/doc/user/query_searchtypes.html#exactsub
Have a look at Table 1, the 2nd query.

Charge matching controls the following:
Default value: an uncharged query atom (q) can also match a charged target atom (t), but a charged q can match only a t with the same charge.
For other values, see:
http://www.chemaxon.com/jchem/doc/user/query_features.html#iso_charge_rad

In your case the query matches the target even if you leave all matching options at their default values (the implicit H difference doesn't prevent it, and the uncharged q can match the charged t). However, in the opposite direction, your charged target will match the query only if charge matching is set to ignore.
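
As a rough illustration of the charge rule described above, here is a toy model in plain Java. It is not JChem code: the class, enum, and method names are made up for this sketch, and the EXACT branch assumes that exact charge matching simply requires equal charges.

```java
// Toy model of the charge-matching rule; NOT the JChem API.
// All names here (ChargeRuleSketch, Mode, chargeMatches) are hypothetical.
final class ChargeRuleSketch {

    enum Mode { DEFAULT, EXACT, IGNORE }

    /** Returns true if a query atom charge may match a target atom charge. */
    static boolean chargeMatches(Mode mode, int queryCharge, int targetCharge) {
        switch (mode) {
            case IGNORE:
                // Charges are not compared at all.
                return true;
            case EXACT:
                // Assumption: exact matching requires identical charges.
                return queryCharge == targetCharge;
            default:
                // Default: an uncharged query atom matches any target atom,
                // a charged query atom needs a target atom with the same charge.
                return queryCharge == 0 || queryCharge == targetCharge;
        }
    }

    public static void main(String[] args) {
        // Uncharged query atom against the C+ of the target: allowed by default.
        System.out.println(chargeMatches(Mode.DEFAULT, 0, 1));  // true
        // Opposite direction: charged query atom against an uncharged target atom.
        System.out.println(chargeMatches(Mode.DEFAULT, 1, 0));  // false
        System.out.println(chargeMatches(Mode.IGNORE, 1, 0));   // true
    }
}
```

The second and third calls mirror the "opposite direction" case: the charged structure used as the query only matches once charge comparison is switched off.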

Best regards,
Robert

User 870ab5b546

06-01-2009 15:12:04

If exact matching checks only heavy atoms, then why doesn't query CCCCCO match to target C1CCOCC1? Does it also check bonds between heavy atoms?





Code:
searchType = exact
setStereoSearch = true
exactStereoMatching = false
doubleBondStereoMatching = false
stereoMatchingModel = false

MolSearch ourSearch = new MolSearch();
MolSearchOptions ourSearchOpts = new MolSearchOptions();
Molecule targetMol = MolImporter.importMol(target);
Molecule queryMol = MolImporter.importMol(query);
ourSearch.setTarget(targetMol);
ourSearch.setQuery(queryMol);
ourSearchOpts.setSearchType(searchType);
ourSearchOpts.setOrderSensitiveSearch(orderSensitive);
ourSearchOpts.setStereoModel(stereoMatchingModel);
if (searchType != SearchConstants.PERFECT) {
   ourSearchOpts.setStereoSearch(setStereoSearch);
   ourSearchOpts.setExactStereoMatching(exactStereoMatching);
   ourSearchOpts.setDoubleBondStereoMatchingMode(doubleBondStereoMatching);
   ourSearchOpts.setChargeMatching(SearchConstants.CHARGE_MATCHING_EXACT);
   ourSearchOpts.setIsotopeMatching(SearchConstants.ISOTOPE_MATCHING_EXACT);
   ourSearchOpts.setRadicalMatching(SearchConstants.RADICAL_MATCHING_EXACT);
   ourSearchOpts.setVagueBondLevel(SearchConstants.VAGUE_BOND_OFF);
}
ourSearch.setSearchOptions(ourSearchOpts);
searchResult = ourSearch.isMatching();

ChemAxon a9ded07333

06-01-2009 17:17:40

Quote:
Does it also check bonds between heavy atoms?
Yes, it does.
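
To make the CCCCCO versus C1CCOCC1 case concrete, here is a small stand-alone sketch in plain Java. It is only a toy model, not how JChem implements exact search: molecules are hand-written arrays of heavy atoms and bonds, bond orders are ignored, and all names are invented for the example. It accepts a pair only if some one-to-one atom mapping preserves both the elements and the bonds.

```java
import java.util.Arrays;

// Toy whole-structure comparison on heavy atoms and bonds; NOT the JChem algorithm.
// All names are hypothetical and bond orders are ignored.
final class HeavyAtomExactSketch {

    /** True if some one-to-one atom mapping preserves elements and bonds. */
    static boolean exactMatch(String[] qAtoms, int[][] qBonds,
                              String[] tAtoms, int[][] tBonds) {
        if (qAtoms.length != tAtoms.length || qBonds.length != tBonds.length) {
            return false; // different heavy atom or bond counts can never match
        }
        int[] map = new int[qAtoms.length];
        Arrays.fill(map, -1);
        return extend(0, map, new boolean[tAtoms.length], qAtoms, tAtoms,
                adjacency(qAtoms.length, qBonds), adjacency(tAtoms.length, tBonds));
    }

    static boolean[][] adjacency(int n, int[][] bonds) {
        boolean[][] adj = new boolean[n][n];
        for (int[] b : bonds) {
            adj[b[0]][b[1]] = true;
            adj[b[1]][b[0]] = true;
        }
        return adj;
    }

    // Backtracking: map query atom q to each unused target atom in turn.
    static boolean extend(int q, int[] map, boolean[] used, String[] qAtoms,
                          String[] tAtoms, boolean[][] qAdj, boolean[][] tAdj) {
        if (q == qAtoms.length) {
            return true;
        }
        for (int t = 0; t < tAtoms.length; t++) {
            if (used[t] || !qAtoms[q].equals(tAtoms[t])) {
                continue;
            }
            boolean ok = true;
            for (int p = 0; p < q && ok; p++) {
                // A bond must exist in the query exactly when it exists in the target.
                ok = qAdj[q][p] == tAdj[t][map[p]];
            }
            if (ok) {
                map[q] = t;
                used[t] = true;
                if (extend(q + 1, map, used, qAtoms, tAtoms, qAdj, tAdj)) {
                    return true;
                }
                used[t] = false;
                map[q] = -1;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        String[] chain = {"C", "C", "C", "C", "C", "O"};                // CCCCCO
        int[][] chainBonds = {{0, 1}, {1, 2}, {2, 3}, {3, 4}, {4, 5}};
        String[] ring = {"C", "C", "C", "O", "C", "C"};                 // C1CCOCC1
        int[][] ringBonds = {{0, 1}, {1, 2}, {2, 3}, {3, 4}, {4, 5}, {5, 0}};
        // false: same heavy atoms, but the ring has one more bond
        System.out.println(exactMatch(chain, chainBonds, ring, ringBonds));
    }
}
```

Both structures contain five carbons and one oxygen, so a comparison of heavy atoms alone would accept the pair; it is the ring-closing bond that rules it out.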

Regards,
Tamás

User 870ab5b546

06-01-2009 17:33:44

Ahh.... So, exact (and perfect) search checks the matching of heavy atoms and the bonds between them.

It follows, then, that if charge, radical, and valence matching are all set to exact or true, then explicit H atoms in the query must be present (implicit or explicit) in the target (assuming no valence errors like pentavalent C), and implicit H atoms in the query may or may not be present (implicit or explicit) in the target. Otherwise, both explicit and implicit H atoms in the query may be absent in the target, unless setImplicitHMatching() is set to IMPLICIT_H_MATCHING_DISABLED, in which case an explicit H atom in the query must be present and explicit in the target. Correct?

ChemAxon 42004978e8

08-01-2009 14:31:18

Hi Bob,

Yes, as you wrote.

However, if you set the mentioned options to exact and carry out an exact search, then a query implicit hydrogen must be present in the target.

Bye,
Robert

User 870ab5b546

08-01-2009 14:45:42

Right.

ChemAxon 42004978e8

08-01-2009 14:55:18

Hi Bob,

One other correction I didn't make earlier:

Explicit hydrogens in the query should always be present in the target.
setImplicitHMatching() only controls whether they can match target implicit hydrogens as well, or only target explicit hydrogens.

Implicit hydrogens may be absent in the target.
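
The explicit H rule can be pictured with a per-atom counting sketch in plain Java. This is a simplified toy model, not JChem code: it only counts hydrogens on one heavy atom, and the class, method, and parameter names are invented for the illustration.

```java
// Toy per-atom model of explicit H matching; NOT the JChem API.
// All names are hypothetical.
final class ExplicitHSketch {

    /**
     * True if the explicit H atoms on a query heavy atom can all be placed
     * on the matched target heavy atom.
     */
    static boolean explicitHSatisfied(int queryExplicitH, int targetExplicitH,
                                      int targetImplicitH,
                                      boolean implicitHMatchingEnabled) {
        // Target implicit hydrogens only count when implicit H matching is enabled.
        int available = targetExplicitH
                + (implicitHMatchingEnabled ? targetImplicitH : 0);
        return queryExplicitH <= available;
    }

    public static void main(String[] args) {
        // Query atom with one explicit H, target atom with three implicit H.
        System.out.println(explicitHSatisfied(1, 0, 3, true));   // true
        System.out.println(explicitHSatisfied(1, 0, 3, false));  // false
    }
}
```

Query implicit hydrogens do not appear in the sketch at all, which reflects the point above that they impose no requirement of their own here.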

Bye,
Robert

User 870ab5b546

08-01-2009 16:09:31

OK, one more time, with feeling.

Explicit H atoms in the query must always be present in the target. The target H atoms may normally be implicit or explicit, but if setImplicitHMatching() is set to IMPLICIT_H_MATCHING_DISABLED, they must be explicit.

For implicit H atoms in the query, if charge, radical, and valence matching are all set to exact or true, then H atoms must be present (implicit or explicit) in the target (assuming no valence errors like pentavalent C). However, if charge, radical, and valence matching are not all set to exact or true, then implicit H atoms in the query may be absent in the target.

Perfect searching sets charge, radical, and valence matching to exact or true. Exact and substructure searching sets charge and radical matching to default.

Right?

ChemAxon 42004978e8

09-01-2009 15:09:17

bobgr wrote:
OK, one more time, with feeling.

Explicit H atoms in the query must always be present in the target. The target H atoms may normally be implicit or explicit, but if setImplicitHMatching() is set to IMPLICIT_H_MATCHING_DISABLED, they must be explicit.

This part is OK.

bobgr wrote:
For implicit H atoms in the query, if charge, radical, and valence matching all set to exact or true, then H atoms must be present (implicit or explicit) in the target (assuming no valence errors like pentavalent C). However, if charge, radical, and valence matching not all set to exact or true, then implicit H atoms in the query may be absent in the target.

These exact/true options don't set H matching directly. The H atoms of the query must be present in the target when these exact options are set AND an exact (or perfect) search is executed. In this case the query implicit hydrogens must be present in the target; they can't be omitted due to a charge/radical difference.
However, in a substructure search, a heavy atom connection can take the place of a given implicit hydrogen, and exact charge/radical matching is still fulfilled.
e.g. C matches CO

Quote:
Perfect searching sets charge, radical, and valence matching to exact or true. Exact and substructure searching sets charge and radical matching to default.

Right?
Yes.

User 870ab5b546

09-01-2009 15:31:27

Yes, of course. OK, once again.

In exact, perfect, and substructure searches, explicit H atoms in the query must always be present in the target. The target H atoms may normally be implicit or explicit, but if setImplicitHMatching() is set to IMPLICIT_H_MATCHING_DISABLED, they must be explicit.

In exact and perfect searches, if charge, radical, and valence matching are all set to exact or true, then implicit H atoms in the query must be present (implicit or explicit) in the target (assuming no valence errors like pentavalent C). However, if charge, radical, and valence matching are not all set to exact or true, then implicit H atoms in the query may be absent in the target. For example, an exact search for query CCC will match target CC[C-] if charge matching is set to default, but not if charge matching is set to exact. In the latter case, not all eight implicit H atoms in the query are present in the target.

In substructure searches, implicit H atoms in the query may be absent in the target, even when charge, radical, and valence matching are all set to exact or true, because the target may have a heavy atom in place of an implicit H atom in the query.

Perfect searching sets charge, radical, and valence matching to exact or true. Exact and substructure searching sets charge and radical matching to default. setImplicitHMatching(), controlling whether explicit H atoms in the query can match to implicit H atoms in the target, is always enabled except in perfect searches of a database table.

If I have finally gotten this right, I suggest you include it in your JChem query guide.
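
The implicit H part of this summary boils down to a small decision rule, sketched here in plain Java. This is only a restatement of the rules in this post as a toy function, not JChem code, and every name in it is invented for the illustration.

```java
// Toy decision rule for query implicit H atoms; NOT the JChem API.
// All names are hypothetical.
final class ImplicitHRuleSketch {

    enum SearchType { SUBSTRUCTURE, EXACT, PERFECT }

    /**
     * True if every implicit H atom of the query has to be present
     * (implicit or explicit) in the target.
     */
    static boolean queryImplicitHRequired(SearchType type, boolean chargeExact,
                                          boolean radicalExact, boolean valenceExact) {
        // Only whole-structure searches can force the hydrogens to be present;
        // in a substructure search a heavy atom may take the place of an H.
        boolean wholeStructure = type == SearchType.EXACT || type == SearchType.PERFECT;
        return wholeStructure && chargeExact && radicalExact && valenceExact;
    }

    public static void main(String[] args) {
        // Exact search with everything set to exact: hydrogens must be there.
        System.out.println(queryImplicitHRequired(SearchType.EXACT, true, true, true));        // true
        // Exact search with default charge matching: CCC may match CC[C-].
        System.out.println(queryImplicitHRequired(SearchType.EXACT, false, true, true));       // false
        // Substructure search: C may match CO even with exact options.
        System.out.println(queryImplicitHRequired(SearchType.SUBSTRUCTURE, true, true, true)); // false
    }
}
```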

ChemAxon 42004978e8

13-01-2009 14:23:13

Quote:
setImplicitHMatching(), controlling whether explicit H atoms in the query can match to implicit H atoms in the target, is always enabled except in perfect searches of a database table.

It's not enabled for perfect searches in query tables.

Quote:
If I have finally gotten this right, I suggest you include it in your JChem query guide.

Yes, you said it (almost totally) right.
Okay, we will check which parts are not explained in the documentation.

Bye,
Robert

User 870ab5b546

13-01-2009 14:34:35

Quote:
Quote:
setImplicitHMatching(), controlling whether explicit H atoms in the query can match to implicit H atoms in the target, is always enabled except in perfect searches of a database table.

It's not enabled for perfect searches in query tables.

Really? See this discussion, specifically,
Quote:
There is only one exception to the rule: perfect search in a database query table. In this case implicit and explicit hydrogen matching is disabled by default, but you can override this by setting IMPLICIT_H_MATCHING_ENABLED.

BTW, this new [Toggle WYSIWYG / code] stuff doesn't work properly.

ChemAxon 42004978e8

13-01-2009 14:47:19

Hi Bob,

The quoted forum topic says that for perfect searches on database query tables, implicit H matching is not enabled by default; for all other search types it is.
That's just what I wrote. Sorry if it was unclear.
You previously wrote that implicit H matching is disabled for database searches. In my previous post I refined this: it is disabled only if the database table is a query table.
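
Stated as a one-line rule, the default looks like the following plain-Java sketch. It is a toy restatement of the sentence above, not JChem code; the names are invented, and the result only describes the default, which can still be overridden with setImplicitHMatching().

```java
// Toy restatement of the default for implicit H matching; NOT the JChem API.
// All names are hypothetical.
final class ImplicitHDefaultSketch {

    enum SearchType { SUBSTRUCTURE, EXACT, PERFECT }

    /** True if implicit H matching is switched on when nothing is set explicitly. */
    static boolean implicitHMatchingOnByDefault(SearchType type, boolean targetIsQueryTable) {
        // The single exception: a perfect search against a database query table.
        return !(type == SearchType.PERFECT && targetIsQueryTable);
    }

    public static void main(String[] args) {
        System.out.println(implicitHMatchingOnByDefault(SearchType.PERFECT, true));   // false
        System.out.println(implicitHMatchingOnByDefault(SearchType.PERFECT, false));  // true
        System.out.println(implicitHMatchingOnByDefault(SearchType.EXACT, true));     // true
    }
}
```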

Bye,
Robert