exclude rule not working properly?

User 870ab5b546

26-04-2009 00:16:12

Hi,


When C[CH-]C#N.OC(=O)C\C=C\CBr is submitted to the reaction definition below, the exclude rule should prevent products from being returned, but it doesn't.  Why not?  We're using JChem 5.1.05.  


[bob@epoch public]$ evaluate -e "dynamicpKb()" "C[CH-]C#N.OC(=O)C\C=C\CBr"
;25.38;;;;;;;;;;
[bob@epoch public]$ evaluate -e "dynamicpKa()" "C[CH-]C#N.OC(=O)C\C=C\CBr"
52.18;;;;3.54;;;32.54;34.09;28.31;28.41;

-- Bob


<?xml version="1.0" ?>
<MDocument>
<MChemicalStruct>
<reaction>
<propertyList>
<property dictRef="NAME" title="NAME">
<scalar><![CDATA[SN2 Nu(-) intermolecular]]></scalar>
</property>
<property dictRef="REACTIVITY" title="REACTIVITY">
<scalar><![CDATA[
totalCharge(reactant(0)) < 0
&& (!match(ratom(3), "[O:1]", 1)
|| match(ratom(3), "[O:1]S(=O)(=O)", 1))
&& (match(ratom(2), "[C:1]#[C,N]", 1)
|| match(ratom(2), "[#6:1]C=[O,N]", 1)
|| match(ratom(2), "[#6:1]C#N", 1)
|| (match(ratom(2), "[O:1]", 1) && !match(ratom(2), "[O:1]C([#6])([#6])[#6]", 1) && !match(ratom(2), "[O:1]S", 1))
|| (match(ratom(2), "[N:1]", 1) && !match(ratom(2), "[#6]C([#6])[N:1]C([#6])[#6]", 1))
|| match(ratom(2), "[F,P,S,Cl,As,Se,Br,Te,I:1]", 1))
&& connections(ratom(1)) == 4
&& (match(ratom(1), "[H][C:1][H]", 1)
|| (match(ratom(1), "[H][C:1]", 1)
&& (match(ratom(2), "[Cl-]") || match(ratom(2), "[Br-]") || match(ratom(2), "") || match(reactant(0), "[N-]=[N+]=[N-]") || apKa(ratom(2)) <= 14)))
]]></scalar>
</property>
<property dictRef="EXPLAIN_REACTIVITY" title="EXPLAIN_REACTIVITY">
<scalar><![CDATA[LG: halides or sulfonates. Nu^-: C(sp), carbonyl or imine or nitrile, O^- that is not t-alkyl-O^-, N^- that is not (sec-alkyl)2N^-, halide or heavier chalcogenide^-. Electrophilic C: CH2, or CH and nucleophile is less basic than HO^-]]></scalar>
</property>
<property dictRef="EXCLUDE" title="EXCLUDE">
<scalar><![CDATA[
dynamicpKb(reactant(1), "1") - dynamicpKa(reactant(0), "1") > 8
]]></scalar>
</property>
<property dictRef="EXPLAIN_EXCLUDE" title="EXPLAIN_EXCLUDE">
<scalar><![CDATA[The pKa rule is not violated.]]></scalar>
</property>
<property dictRef="STANDARDIZATION" title="STANDARDIZATION">
<scalar><![CDATA[<?xml version="1.0" encoding="UTF-8"?>
<!-- Standardizer configuration file -->
<!-- This configuration file is created with ChemAxon Config Builder -->


<StandardizerConfiguration Version ="0.1">
<Actions>
<Transformation ID="O-enolate to C-enolate" Structure="[#8-]C=C>>[#6-]C=O" Type="string" Groups="reactant1"/> </Actions>
</StandardizerConfiguration>
]]></scalar>
</property>
<property dictRef="Number of reactants" title="Number of reactants">
<scalar>2</scalar>
</property>
</propertyList>
<reactantList>
<molecule molID="m1">
<atomArray
atomID="a1 a2 a3"
elementType="C O H"
mrvMap="1 3 0"
reactionStereo="Inv 0 0"
mrvQueryProps="0 L,O,Cl,Br,I: 0"
x2="-6.881875038146973 -5.341875038146973 -7.280456367604854"
y2="4.042500257492065 4.042500257492065 5.530026029977231"
/>
<bondArray>
<bond atomRefs2="a1 a2" order="1" />
<bond atomRefs2="a1 a3" order="1" />
</bondArray>
</molecule>
<molecule molID="m2">
<atomArray
atomID="a1"
elementType="C"
formalCharge="-1"
mrvMap="2"
mrvQueryProps="L,C,N,O,F,P,S,Cl,As,Se,Br,Te,I:"
x2="-6.641250133514404"
y2="1.1549999713897705"
/>
<bondArray>
</bondArray>
</molecule>
</reactantList>
<productList>
<molecule molID="m3">
<atomArray
atomID="a1"
elementType="O"
formalCharge="-1"
mrvMap="3"
mrvQueryProps="L,O,Cl,Br,I:"
x2="6.400625228881836"
y2="0.5774999856948853"
/>
<bondArray>
</bondArray>
</molecule>
<molecule molID="m4">
<atomArray
atomID="a1 a2 a3"
elementType="C C H"
mrvMap="2 1 0"
mrvQueryProps="L,C,N,O,F,P,S,Cl,As,Se,Br,Te,I: 0 0"
x2="7.56771259790619 7.601281909793624 6.860536746513022"
y2="2.4937202768337303 4.033354357159123 5.383501239196915"
/>
<bondArray>
<bond atomRefs2="a1 a2" order="1" />
<bond atomRefs2="a2 a3" order="1" />
</bondArray>
</molecule>
</productList>
</reaction>
</MChemicalStruct>
</MDocument>



ChemAxon e08c317633

27-04-2009 10:50:57

bobgr wrote:

When C[CH-]C#N.OC(=O)C\C=C\CBr is submitted to the reaction definition below, the exclude rule should prevent products from being returned, but it doesn't.  Why not?  We're using JChem 5.1.05.


[bob@epoch public]$ evaluate -e "dynamicpKb()" "C[CH-]C#N.OC(=O)C\C=C\CBr"
;25.38;;;;;;;;;;
[bob@epoch public]$ evaluate -e "dynamicpKa()" "C[CH-]C#N.OC(=O)C\C=C\CBr"
52.18;;;;3.54;;;32.54;34.09;28.31;28.41;



By "when C[CH-]C#N.OC(=O)C\C=C\CBr is submitted", do you mean that

- the first reactant (reactant(0)) is "C[CH-]C#N", and
- the second reactant (reactant(1)) is "OC(=O)C\C=C\CBr"?

If so, then I can explain what happens. None of the atoms in your second reactant ("OC(=O)C\C=C\CBr") has a dynamicpKb() value (see also this topic), so your exclude rule


dynamicpKb(reactant(1), "1") - dynamicpKa(reactant(0), "1") > 8


will look like


NaN - 3.54  > 8


and that will not be true, so the generated products will not be excluded.
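
The arithmetic behind this can be reproduced outside JChem. The following is a minimal Python sketch, assuming the Chemical Terms evaluator treats a missing pKb as an IEEE 754 NaN, as the explanation above suggests. The function name exclude_rule is hypothetical and only mirrors the expression in the EXCLUDE property; it is not part of any ChemAxon API.

```python
import math


def exclude_rule(pkb_reactant1: float, pka_reactant0: float) -> bool:
    """Mirror of: dynamicpKb(reactant(1), "1") - dynamicpKa(reactant(0), "1") > 8."""
    return pkb_reactant1 - pka_reactant0 > 8


nan = float("nan")

# NaN propagates through the subtraction, so the difference is NaN as well.
difference = nan - 3.54
print(math.isnan(difference))

# An ordered comparison against NaN is false, so the rule never fires.
print(exclude_rule(nan, 3.54))

# With a real pKb value (25.38 is taken from the evaluate output) it can fire.
print(exclude_rule(25.38, 3.54))
```

In other words, a missing pKb silently turns the whole rule off rather than raising an error.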


You can check whether a returned value is NaN:


$ evaluate -e 'bpka("1") != "NaN"' "CCCCC"
0
$ evaluate -e 'bpka("1") == "NaN"' "CCCCC"
1


I hope this helps.


Zsolt

User 870ab5b546

27-04-2009 12:55:29

I guess the previous post is related to this one, then.


(1) How do I handle the NaN value within the context of the exclude rule?  Will this work?


dynamicpKb(reactant(1), "1")  != NaN && dynamicpKb(reactant(1), "1") - dynamicpKa(reactant(0), "1") > 8


(2) Hm, I see now that Reactor returns a product for C[CH-]C#N.OC(=O)C\C=C\CBr, which would make C[CH-]C#N reactant(0) and OC(=O)C\C=C\CBr reactant(1), as you suggest, so you may be right.  I had assumed that reactant(0) would correspond to molecule 1 in the MRV definition of the reaction (the electrophile in this case), and reactant(1) would correspond to molecule 2 (the anion).  Is that not the case?  

ChemAxon e08c317633

27-04-2009 20:08:52

bobgr wrote:

(1) How do I handle the NaN value within the context of the exclude rule?  Will this work?


dynamicpKb(reactant(1), "1")  != NaN && dynamicpKb(reactant(1), "1") - dynamicpKa(reactant(0), "1") > 8



This should work:


dynamicpKb(reactant(1), "1") != "NaN" && dynamicpKb(reactant(1), "1") - dynamicpKa(reactant(0), "1") > 8


Note that NaN must be enclosed in quotes.
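
For readers who want to see the guard pattern in a general-purpose language, here is a rough Python analogue. It is only a sketch: guarded_exclude_rule is a hypothetical name, and it assumes that a missing pKb reaches the rule as a floating-point NaN. Python cannot test for NaN with != (NaN compares unequal to everything, including itself), so math.isnan stands in for the quoted "NaN" comparison of Chemical Terms.

```python
import math


def guarded_exclude_rule(pkb_reactant1: float, pka_reactant0: float) -> bool:
    """Fire only when a pKb exists and the pKb - pKa gap exceeds 8."""
    if math.isnan(pkb_reactant1):
        # No basic site was found on reactant(1): there is nothing to compare.
        return False
    return pkb_reactant1 - pka_reactant0 > 8


nan = float("nan")
print(guarded_exclude_rule(nan, 3.54))    # missing pKb: rule does not fire
print(guarded_exclude_rule(25.38, 3.54))  # gap of 21.84: rule fires
print(guarded_exclude_rule(10.0, 3.54))   # gap of 6.46: rule does not fire
```

Under IEEE 754 semantics the guard does not change the outcome for a missing value, since the unguarded comparison is already false; what it does is make the handling of the missing value explicit and easy to change.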












bobgr wrote:

(2) Hm, I see now that Reactor returns a product for C[CH-]C#N.OC(=O)C\C=C\CBr, which would make C[CH-]C#N reactant(0) and OC(=O)C\C=C\CBr reactant(1), as you suggest, so you may be right.  I had assumed that reactant(0) would correspond to molecule 1 in the MRV definition of the reaction (the electrophile in this case), and reactant(1) would correspond to molecule 2 (the anion).  Is that not the case?




Reactant assignment in Reactor is order dependent. If you set the reactants with Reactor.setReactants(Molecule[]), the first molecule in the array (index 0) becomes the first reactant (reactant(0)); if you use the Reactor.setReactants(String) method, the first fragment in the SMILES string becomes the first reactant (reactant(0)). Reactor does not try to guess which reactant is supposed to be first or second; the order depends entirely on how you set them.
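
The index convention described above can be pictured with a few lines of Python. This is an illustration only, not Reactor's implementation: assign_reactants is a hypothetical helper, and splitting on "." is a simplification that is good enough for the SMILES strings in this thread.

```python
def assign_reactants(smiles: str) -> dict:
    """Map reactant(i) to the i-th dot-separated fragment of a SMILES string."""
    fragments = smiles.split(".")
    return {f"reactant({i})": fragment for i, fragment in enumerate(fragments)}


# Raw string, because the SMILES contains backslashes for the double-bond geometry.
submitted = r"C[CH-]C#N.OC(=O)C\C=C\CBr"
for name, fragment in assign_reactants(submitted).items():
    print(name, fragment)
```

With this input the anion is reactant(0) and the bromide is reactant(1), which matches the assignment assumed in the earlier explanation.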


Zsolt

User 870ab5b546

28-04-2009 01:14:39

Now I'm really confused.


Using the reaction definition below, I submit C[O-].CI to the calculator.  The log reads as follows:


Submitting substrate array 1 to Reactor: C[O-].CI
Permutation 1 of 2: CI.C[O-]
Product set 1 from Reactor for permutation 1 is empty; breaking.
Permutation 2 of 2: C[O-].CI
Product set 1 obtained from Reactor for permutation 2.
Initial substrates reacted to give 2 products: .C[O:2][CH3:1]
After adding stereo and removing maps but before specifying unspecified configurations, Reactor products are: .COC

So it appears that only when the reactants are C[O-].CI does Reactor return products -- that is, when the first substrate molecule in the array corresponds to the *second* molecule in the reaction definition, and the second substrate molecule in the array corresponds to the *first* molecule in the reaction definition.  Knowing that Reactor accepts C[O-].CI and not CI.C[O-] does explain why the negatively charged reactant must be reactant(0), but it doesn't explain why Reactor *only* accepts the substrates in the *opposite* order that they are listed in the reaction definition.  *WHY?*


-- Bob


P.S.  I have learned that the order in which I fuse several molecules together is not reflected in the SMILES of the fused molecule.  So I obtain order-dependent SMILES strings by concatenating the SMILES forms of the individual substrate molecules.  
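
The concatenation approach from the P.S. can be sketched as follows. This is a hypothetical Python helper, not the code that produced the log above, and the order in which it emits the permutations need not match that log.

```python
from itertools import permutations


def ordered_smiles(substrates):
    """Build one dot-joined SMILES string per ordering of the substrates."""
    return [".".join(ordering) for ordering in permutations(substrates)]


for index, smiles in enumerate(ordered_smiles(["C[O-]", "CI"]), start=1):
    print(f"Permutation {index} of 2: {smiles}")
```

Because the strings are joined by hand, the fragment order in each one is exactly the order of the individual substrates.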


<?xml version="1.0" ?>
<MDocument>
<MChemicalStruct>
<reaction>
<propertyList>
<property dictRef="NAME" title="NAME">
<scalar><![CDATA[SN2 Nu(-) intermolecular]]></scalar>
</property>
<property dictRef="REACTIVITY" title="REACTIVITY">
<scalar><![CDATA[
totalCharge(reactant(0)) < 0
&& (!match(ratom(3), "[O:1]", 1)
|| match(ratom(3), "[O:1]S(=O)(=O)", 1))
&& (match(ratom(2), "[C:1]#[C,N]", 1)
|| match(ratom(2), "[#6:1]C=[O,N]", 1)
|| match(ratom(2), "[#6:1]C#N", 1)
|| (match(ratom(2), "[O:1]", 1) && !match(ratom(2), "[O:1]C([#6])([#6])[#6]", 1) && !match(ratom(2), "[O:1]S", 1))
|| (match(ratom(2), "[N:1]", 1) && !match(ratom(2), "[#6]C([#6])[N:1]C([#6])[#6]", 1))
|| match(ratom(2), "[F,P,S,Cl,As,Se,Br,Te,I:1]", 1))
&& connections(ratom(1)) == 4
&& (match(ratom(1), "[H][C:1][H]", 1)
|| (match(ratom(1), "[H][C:1]", 1)
&& (match(ratom(2), "[Cl-]") || match(ratom(2), "[Br-]") || match(ratom(2), "") || match(reactant(0), "[N-]=[N+]=[N-]") || apKa(ratom(2)) <= 14)))
]]></scalar>
</property>
<property dictRef="EXPLAIN_REACTIVITY" title="EXPLAIN_REACTIVITY">
<scalar><![CDATA[LG: halides or sulfonates. Nu^-: C(sp), carbonyl or imine or nitrile, O^- that is not t-alkyl-O^-, N^- that is not (sec-alkyl)2N^-, halide or heavier chalcogenide^-. Electrophilic C: CH2, or CH and nucleophile is less basic than HO^-]]></scalar>
</property>
<property dictRef="EXCLUDE" title="EXCLUDE">
<scalar><![CDATA[
dynamicpKb(reactant(0), "1") - dynamicpKa(reactant(1), "1") > 8
]]></scalar>
</property>
<property dictRef="EXPLAIN_EXCLUDE" title="EXPLAIN_EXCLUDE">
<scalar><![CDATA[The pKa rule is not violated.]]></scalar>
</property>
<property dictRef="STANDARDIZATION" title="STANDARDIZATION">
<scalar><![CDATA[<?xml version="1.0" encoding="UTF-8"?>
<!-- Standardizer configuration file -->
<!-- This configuration file is created with ChemAxon Config Builder -->



<StandardizerConfiguration Version ="0.1">
<Actions>
<Transformation ID="O-enolate to C-enolate" Structure="[#8-]C=C>>[#6-]C=O" Type="string" Groups="reactant1"/>
<Transformation ID="nitrile enolate to C-enolate" Structure="[#7-]=C=C>>[#6-]C#N" Type="string" Groups="reactant1"/>
<Transformation ID="imine enolate to C-enolate" Structure="[#7-]C=C>>[#6-]C=N" Type="string" Groups="reactant1"/>
</Actions>
</StandardizerConfiguration>
]]></scalar>
</property>
<property dictRef="Number of reactants" title="Number of reactants">
<scalar>2</scalar>
</property>
</propertyList>
<reactantList>
<molecule molID="m1">
<atomArray
atomID="a1 a2 a3"
elementType="C O H"
mrvMap="1 3 0"
reactionStereo="Inv 0 0"
mrvQueryProps="0 L,O,Cl,Br,I: 0"
x2="-6.881875038146973 -5.341875038146973 -7.280456367604854"
y2="4.042500257492065 4.042500257492065 5.530026029977231"
/>
<bondArray>
<bond atomRefs2="a1 a2" order="1" />
<bond atomRefs2="a1 a3" order="1" />
</bondArray>
</molecule>
<molecule molID="m2">
<atomArray
atomID="a1"
elementType="C"
formalCharge="-1"
mrvMap="2"
mrvQueryProps="L,C,N,O,F,P,S,Cl,As,Se,Br,Te,I:"
x2="-6.641250133514404"
y2="1.1549999713897705"
/>
<bondArray>
</bondArray>
</molecule>
</reactantList>
<productList>
<molecule molID="m3">
<atomArray
atomID="a1"
elementType="O"
formalCharge="-1"
mrvMap="3"
mrvQueryProps="L,O,Cl,Br,I:"
x2="6.400625228881836"
y2="0.5774999856948853"
/>
<bondArray>
</bondArray>
</molecule>
<molecule molID="m4">
<atomArray
atomID="a1 a2 a3"
elementType="C C H"
mrvMap="2 1 0"
mrvQueryProps="L,C,N,O,F,P,S,Cl,As,Se,Br,Te,I: 0 0"
x2="7.56771259790619 7.601281909793624 6.860536746513022"
y2="2.4937202768337303 4.033354357159123 5.383501239196915"
/>
<bondArray>
<bond atomRefs2="a1 a2" order="1" />
<bond atomRefs2="a2 a3" order="1" />
</bondArray>
</molecule>
</productList>
</reaction>
</MChemicalStruct>
</MDocument>

ChemAxon e08c317633

30-04-2009 21:38:01

bobgr wrote:

Now I'm really confused.


Using the reaction definition below, I submit C[O-].CI to the calculator.  The log reads as follows:


Submitting substrate array 1 to Reactor: C[O-].CI
Permutation 1 of 2: CI.C[O-]
Product set 1 from Reactor for permutation 1 is empty; breaking.
Permutation 2 of 2: C[O-].CI
Product set 1 obtained from Reactor for permutation 2.
Initial substrates reacted to give 2 products: .C[O:2][CH3:1]
After adding stereo and removing maps but before specifying unspecified configurations, Reactor products are: .COC

So it appears that only when the reactants are C[O-].CI does Reactor return products -- that is, when the first substrate molecule in the array corresponds to the *second* molecule in the reaction definition, and the second substrate molecule in the array corresponds to the *first* molecule in the reaction definition.  Knowing that Reactor accepts C[O-].CI and not CI.C[O-] does explain why the negatively charged reactant must be reactant(0), but it doesn't explain why Reactor *only* accepts the substrates in the *opposite* order that they are listed in the reaction definition.  *WHY?*


-- Bob



It is a very weird bug. The order of the reactants depends on the arrangement (position) of the reactant molecules in the reaction definition (see sn2.png). The lower molecule ("[#6,#7,#8,F,#15,#16,Cl,#33,#34,Br,#52,I;-:2]") is recognized as the first reactant, and the upper molecule ("[H][C:1][#8,Cl,Br,I:3]") as the second. After repositioning the reactants (see sn2mod.png), the reaction works correctly:


$ react -r sn2mod.mrv -n rs "CI" "C[O-]"
COC


$ react -r sn2mod.mrv -n rs "C[O-]" "CI"


(no products in 2nd case)


It is clearly a bug, and it will be fixed in the next patch release (JChem 5.2.2). This bug will show up quite rarely, and the workaround is to reposition the reactants.


Thanks for the report, and sorry for the inconvenience.


Zsolt

User 870ab5b546

02-05-2009 01:43:55

When you fix the bug, I'm going to have to change my reaction definitions, aren't I?  Sigh...


Do you have any idea what circumstances cause this bug to rear its ugly head?

ChemAxon e08c317633

04-05-2009 09:02:45

bobgr wrote:

When you fix the bug, I'm going to have to change my reaction definitions, aren't I?  Sigh...



No, you won't have to change your reactions. After the fix, the reactant order will be recognized correctly.

bobgr wrote:

Do you have any idea what circumstances cause this bug to rear its ugly head?



It appears when the "centers" of the reactants are close to each other horizontally.
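
To check whether a given definition is at risk, the horizontal distance between the reactant centers can be estimated from the x2/y2 coordinates in the MRV file. The Python sketch below uses the coordinates of m1 and m2 from the reaction definition posted earlier in this thread. Taking the arithmetic mean of the atom coordinates as the "center" is an assumption; the thread does not say how Reactor computes it.

```python
def centroid(xs, ys):
    """Arithmetic mean of the atom coordinates (an assumed notion of 'center')."""
    return sum(xs) / len(xs), sum(ys) / len(ys)


# m1: upper reactant, the electrophile ([H][C:1][leaving group:3]).
m1 = centroid(
    [-6.881875038146973, -5.341875038146973, -7.280456367604854],
    [4.042500257492065, 4.042500257492065, 5.530026029977231],
)
# m2: lower reactant, the anionic nucleophile (map number 2).
m2 = centroid([-6.641250133514404], [1.1549999713897705])

dx = abs(m1[0] - m2[0])
dy = abs(m1[1] - m2[1])
print(f"horizontal distance: {dx:.2f}")
print(f"vertical distance:   {dy:.2f}")
```

For this definition the centers are far apart vertically but nearly aligned horizontally, which is the arrangement described above.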


Zsolt


User 870ab5b546

04-05-2009 16:04:29

Zsolt wrote:

bobgr wrote:

When you fix the bug, I'm going to have to change my reaction definitions, aren't I?  Sigh...

No, you won't have to change your reactions. After the fix, the reactant order will be recognized correctly.

Not if I have already corrected the definitions so that they work now.