Problems with pKa plug-in accuracy

User 0cf6155cf3

10-09-2008 01:15:00

To whom it may concern,





The Bioinformatics Research Group at SRI is trying to use the Marvin API to correctly and consistently protonate the compounds in our EcoCyc Pathway / Genome Database (EcoCyc.org) based on the reference cytosolic pH of 7.3 for E. coli.

After running over a thousand of our chemical compounds with structures through the Marvin Calculator plug-in for pKa's (chemaxon.marvin.calculations.pKaPlugin) to obtain the major pKa values, we chose to sample the results for verification by taking the compounds that had at least one pKa from Marvin that fell within the range of 6.3 to 8.3 pH units.

We then looked up these compounds in the IUPAC book "Ionisation constants of organic acids in aqueous solution" (see link to bibliographic information below) to validate that the Marvin pKa's were close to the experimentally-verified values.

Nine of the compounds that we were able to find in the IUPAC book had pKa values that differed from the Marvin pKa values by more than 0.5 pH units. They are listed below.

One thing that we've noticed is that of the ones we've identified, all but one of them have phosphate groups. We suspect that perhaps the algorithm that Marvin uses for calculating the pKa values isn't performing as well for phosphate groups.

For our research purposes, it is very important that we can rely on the output of Marvin to be as accurate as possible. We are requesting assistance in pinning down the cause of this discrepancy, and possibly in tuning the pKa calculation algorithm that Marvin employs.





Regards,





~Tomer Altman








Compounds and their pKa's (with the Marvin value followed by the IUPAC value):





Code:
adenosine                 (12.87 , 12.35)
guanosine triphosphate    (7.12 , 7.65)
cytidine-5'-triphosphate  (7.12 , 7.65)
adenosine-5'-phosphate    (6.78 , 6.23)
glucose-6-phosphate       (6.77 , 6.11)
adenosine-5'-diphosphate  (7.21 , 6.44)
fructose-6-phosphate      (6.77 , 5.84)
thymidine-5'-triphosphate (9.36 , 10.7)
xanthine                  (6.48 , 7.53)
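
The 0.5-unit screening described in this post can be sketched in a few lines of Python. This is only an illustration: the numbers are copied from the table above, and the names `PAIRS`, `THRESHOLD` and `discrepancies` are made up for this example and are not part of the Marvin API.

```python
# (Marvin pKa, IUPAC pKa) pairs copied from the table in this post.
PAIRS = {
    "adenosine": (12.87, 12.35),
    "guanosine triphosphate": (7.12, 7.65),
    "cytidine-5'-triphosphate": (7.12, 7.65),
    "adenosine-5'-phosphate": (6.78, 6.23),
    "glucose-6-phosphate": (6.77, 6.11),
    "adenosine-5'-diphosphate": (7.21, 6.44),
    "fructose-6-phosphate": (6.77, 5.84),
    "thymidine-5'-triphosphate": (9.36, 10.7),
    "xanthine": (6.48, 7.53),
}
THRESHOLD = 0.5  # pH units; the cut-off used in this post


def discrepancies(pairs, threshold=THRESHOLD):
    """Return (name, Marvin - IUPAC) for entries whose absolute difference
    exceeds the threshold, largest absolute difference first."""
    rows = [(name, round(marvin - iupac, 2)) for name, (marvin, iupac) in pairs.items()]
    rows = [row for row in rows if abs(row[1]) > threshold]
    return sorted(rows, key=lambda row: abs(row[1]), reverse=True)


if __name__ == "__main__":
    for name, diff in discrepancies(PAIRS):
        print(f"{name:26s} {diff:+.2f}")
```

Keeping the sign of the difference makes it easy to see whether the calculated value is too high or too low for a given class of compound.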









Bibliographic reference:





"Ionisation constants of organic acids in aqueous solution"


http://jenson.stanford.edu/uhtbin/cgisirsi/atSfEQq31Q/GREEN/266980024/9

User 851ac690a0

16-09-2008 20:54:01

Hi,





Thank you for your feedback.


We are going to consider your requirements in the development of the pKa algorithm.








Jozsi

User 0cf6155cf3

16-09-2008 23:58:51

Thank you for your reply.





What is the time-frame for this evaluation? Our research project is held up pending resolution of this issue. We deem it urgent.





Regards,





~Tomer Altman





SRI, International

User 851ac690a0

17-09-2008 15:40:39

Hi,








I have fixed the pKa calculation problems.


Code:
adenosine                 (12.45 , 12.35)   //alcohol
guanosine triphosphate    (7.42 , 7.65)     //phosphate
cytidine-5'-triphosphate  (7.42 , 7.65)     //phosphate
adenosine-5'-phosphate    (6.10 , 6.23)     //phosphate
glucose-6-phosphate       (6.35 , 6.11)     //phosphate
adenosine-5'-diphosphate  (6.83 , 6.44)     //phosphate
fructose-6-phosphate      (6.35 , 5.84)     //phosphate
thymidine-5'-triphosphate (9.96 , 10.7)     //NH-acid
xanthine                  (7.33 , 7.53)     //NH-acid






The experimental pKa of thymidine-5'-triphosphate is given as 10.7. This value seems too large; it may not be a correct experimental value. We plan a patch release next week.





Jozsi

User 0cf6155cf3

24-09-2008 00:26:17

Thanks for this update, Jozsi.





If I download Marvin now, will it have the code patches?





Thanks,





~T

User 851ac690a0

24-09-2008 16:50:42

Hi,








The newest patch release (5.1.2) will be available in a few days.





Jozsi

User 0cf6155cf3

04-11-2008 21:40:25

We have re-processed our compounds with the new version of Marvin Beans.





The number of compounds that had a significant discrepancy (we define that as greater than 0.5 pH units) between the Marvin calculated pKa's and the values found in the literature was reduced from 9 to 4:





Code:
glucose1-phosphate       ( 5.819 , 6.5  )
GTP                      ( 6.819 , 7.65 )
cytidine-5'-triphosphate ( 6.819 , 7.65 )
fructose-6-phosphate     ( 6.347 , 5.84 )








Please take a look at these compounds. Is there a particular reason why the pKa calculation might be off for these compounds?





The experimental values were from the same IUPAC book that I referred to in a previous message.





Thank you.

User 851ac690a0

05-11-2008 12:19:02

Hi,








I have calculated the pKa of the first compound (glucose-6-phosphate) and my result is not the same as yours.

Could you send the version number of the release that you downloaded?








Jozsi

User 0cf6155cf3

17-11-2008 17:07:56

Marvin 5.1.2


Build Date: 2008-10-01





The first compound is glucose-1-phosphate, not glucose-6-phosphate (sorry for the missing hyphen).





Do you get a different value for glucose-1-phosphate?





Thanks,





~Tomer

User 851ac690a0

24-11-2008 11:13:10

Hi,
Quote:
Do you get a different value for glucose-1-phosphate?
I obtained the same pKa (5.80) as you.





I improved the pKa calculation according to these data:

http://www.biochemj.org/bj/059/0203/bj0590203_browse.htm#
Jozsi

User 0cf6155cf3

16-11-2010 00:20:40

Hi Jozsi, or whomever it may concern at ChemAxon.


We have found additional problems with the protonation of phosphate moieties in ATP. Below please find a description of the problem from Dr. Ron Caspi at MetaCyc.org:


---


First, I re-evaluated Marvin's opinion [using MarvinSketch 5.3]. Marvin thinks that at pH 7.3 57%
of the ATP molecules will be at the -3 form, while only 43% will be at
the -4 form. So it picks the -3 form. But as Tomer mentioned, this is
not exactly a black vs white issue due to the pKa values of ATP.



I also did some literature search, and found an informative 2006 paper,
which I attach [http://www.ncbi.nlm.nih.gov/pubmed/16488529].


As you can see in the abstract, "A selection of literature data on ATP 
protonation constants and on activity isopiestic coefficients was
performed, together with new potentiometric measurements (by ISE-H+,
glass electrode). Both literature and new experimental data were used to
model the dependence on ionic strength and ionic medium of ATP
protonation by SIT (Specific ion Interaction Theory) and Pitzer equations.".



So, luckily for us, these authors, who understand about this more than
us (one would hope) have already looked at previously known information.



The final conclusion of the paper can be found in figures 4 and 5.
Looking at pH 7.3, at very low ionic strength (figure 4)  the -4 and -3
forms are very close, while at high ionic strength (figure 5) the -4
form has a clear majority. However, even at very low ionic strength, the
-4 form is more prevalent than the -3 form. This contradicts Marvin.


---


Please let us know why Marvin's results differ from these experimental results. It is important for us to have the precise protonation state of ATP, ADP, AMP, and other phosphate-moiety bearing compounds, and to have them be consistent relative to one another.


Thank you,


~Tomer

User 851ac690a0

16-11-2010 14:48:54

Hi,


 


Please enter "ATP" into the Marvin sketch at this site, http://www.chemaxon.com/marvin/sketch/index.php, and run the pKa calculation.

You should get the following figure with the tabulated macrospecies distribution. As you can see, the dominant form of "ATP" is "ATP(-4)" at the highlighted pH values. This is in agreement with your expectation.

at pH = 7.2 :   55.56% ATP(-4)
at pH = 7.4 :   66.53% ATP(-4)
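
As a rough cross-check of percentages like these, one can back-calculate the macroscopic pKa they imply. The sketch below assumes that only the ATP(-3) and ATP(-4) species are populated near pH 7.3 (an assumption made for this example, not something stated in the thread) and applies the Henderson-Hasselbalch relation. `implied_pka` and `fraction_deprotonated` are hypothetical helper names, not Marvin functions.

```python
import math


def implied_pka(ph, fraction_deprot):
    """pKa implied by a two-species equilibrium: pH = pKa + log10([A-]/[HA])."""
    ratio = fraction_deprot / (1.0 - fraction_deprot)
    return ph - math.log10(ratio)


def fraction_deprotonated(ph, pka):
    """Fraction of the more deprotonated species at a given pH (two-species model)."""
    return 1.0 / (1.0 + 10.0 ** (pka - ph))


if __name__ == "__main__":
    # ATP(-4) shares reported in this post for pH 7.2 and pH 7.4.
    for ph, share in [(7.2, 0.5556), (7.4, 0.6653)]:
        print(f"pH {ph}: implied pKa = {implied_pka(ph, share):.2f}")
    # 43% ATP(-4) at pH 7.3, as reported earlier in the thread for MarvinSketch 5.3.
    print(f"pH 7.3: implied pKa = {implied_pka(7.3, 0.43):.2f}")
```

Under the two-species assumption, both values in this post correspond to a pKa near 7.1, while the 57%/43% split reported for 5.3 corresponds to a pKa near 7.4. A shift of about 0.3 units is enough to flip the major species at pH 7.3.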


 


Jozsi

User 0cf6155cf3

14-01-2011 01:16:02

Hi Jozsi,


Thanks for your reply. Dr. Caspi and I will look at the performance of v5.4 with ATP & similar compounds.


 


Dr. Caspi has also noticed another problem, this time with a compound without phosphate moieties. His notes:



Open arsenite in Marvin. If you invoke the major microspecies
function, it reports that the form that has three protonated oxygens
is the major species at pH 7.3.



But - if you invoke the pKa analysis, you will find out that at pH
7.3 it is actually form number 2 (the one that has one unprotonated
oxygen) that is the dominant form. So the two analyses do not agree
with each other.



I'm attaching our MOL-file for arsenite. I have also reproduced this problem using the v5.4 web applet from the ChemAxon site.


Thanks,


~Tomer

User 851ac690a0

09-02-2011 17:02:47

Hi,


 


I fixed this bug. The fix will be available in the 5.4.2 version.


 


Thank you for reporting this bug.


Jozsi

User 0cf6155cf3

11-02-2011 02:01:18










Hi Jozsi,


Could you give us a rough estimate for when the 5.4.2 version will be available?


Thanks,


~Tomer

User 851ac690a0

11-02-2011 07:46:55

Hi,


To the best of my knowledge, 5.4.2 will be released within this time interval: February 15 to the end of April.

Please note that the major microspecies bug is related to symmetric molecules.

You can continue your major microspecies prediction for unsymmetric molecules until the 5.4.2 version is available.

On the other hand, the [%] distribution in the Marvin GUI is correct even for symmetric molecules, so you can obtain the major microspecies from the [%] distribution data.


Jozsi

User 9be621d283

24-02-2011 13:29:54

Hi Jozsi,


We also make heavy use of the ChemAxon tool for making the Rhea database (http://www.ebi.ac.uk/rhea), where we also make use of compounds at pH 7.3, so we are also eagerly awaiting the new release, and hope it will be soon.


Regarding the tool at http://www.chemaxon.com/marvin/sketch/index.php. Does it include the patch?


We just tested ATP in this version of the tool, and while the pKa tool predicts what you noted in your post of Nov. 16, the Protonation: Major Microspecies tool in the Tools menu still shows ATP(3-) at pH 7.3. So if the patch is already used for this version of the tool, there is still a problem.


Regards,


Anne Morgat

User 851ac690a0

24-02-2011 21:37:26

Hi,


Thank you for your follow-up letter.


The "ATP" will be OK in the upcoming release.


Jozsi

User 786a257177

18-05-2011 08:46:56

Dear all,


We work with the Rhea database at the EBI, using compounds normalised to pH 7.3. To that end, we use MarvinBeans and are upgrading from 5.2.05.1 to 5.5.0.0.


With the latest MarvinBeans version (5.5.0.0) we realised that the major microspecies for ATP at pH 7.3 is ATP(4-) instead of ATP(3-) calculated with v5.2.05.1. I see in previous posts that this is a fix introduced in latest releases.


However we also noticed that v5.2.05.1 calculated ADP as ADP(3-) at pH 7.3, while now 5.5.0.0 returns ADP(2-). Is this intended/accurate? Sorry, I don't have the expertise, just asking as this affects a lot of reactions in our database.


Thanks in advance,


Rafael

User 9be621d283

18-05-2011 13:16:27










To complement the post of Rafael, it is clear that the major species of ADP at pH 7.3 has a net charge of -3.
It seems that the bug is similar to what we had before with ATP; i.e. an inconsistency between the tabulated distribution and the selection of the major microspecies.

Best regards,

Anne

User 0cf6155cf3

18-05-2011 17:22:09










I have also reproduced this error here at SRI, using MarvinBeans for Linux 5.5.0.1, as downloaded this morning.


Having a reliable means of determining the major microspecies of a compound at a specified pH at standard temperature and pressure is vital for the construction of balanced reaction databases such as MetaCyc and RHEA. Can we get an ETA on when this problem will be fixed?


Thank you very much,


~Tomer Altman

User 0cf6155cf3

18-05-2011 17:37:12










I forgot to attach the MOL files for three compounds that exhibit this problem: ADP, FAD, and FMN.


 


~T

User 851ac690a0

20-05-2011 13:52:04

Hi,


I've fixed the inconsistency.


The fix will be available ... soon.


Thanks to everybody who contributed to fixing this bug, either by sending a couple of structures or by making valuable comments.


Jozsi

User 9be621d283

10-06-2011 07:19:01

Hi,


How do you explain the large discrepancy between the theoretical prediction for O-phosphoserine published in the following article (pKa3 of 9.85) and the pKa calculation plug-in (pKa3 of 9.85)?

Consequently, the major species at pH 7.3 has a net charge of (-3) whereas a net charge of (-2) is expected.


Theoretical pKa prediction of O-phosphoserine in aqueous solution


Chemical Physics Letters Volume 501, Issues 1-3,
6 December 2010,
Pages 123-129


http://www.sciencedirect.com/science/article/pii/S0009261410014442


best regards,


Anne




User 851ac690a0

10-06-2011 13:52:41

Hi,


How do you explain ...

At pH = 7.3 the three "OH" groups are deprotonated and the basic group is protonated, which is why the expected net charge is (-2). It is calculated according to this relation: 3*( -1 ) + 1*( +1 ) = ( -2 ).

The predicted pKa of the "NH2" group was too low in our calculation, so the "NH2" group was considered to be unprotonated in the major form at the given pH, and the resultant charge was: 3*(-1) + 0*(+1) = (-3).
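
The charge bookkeeping above can be written as a small sketch. It treats every ionizable group independently and puts each one in its majority state at the given pH, which is a simplification and not a description of Marvin's algorithm. The acidic pKa values and the "too low" amine value of 7.0 are placeholders chosen for illustration; `dominant_net_charge` is a hypothetical name.

```python
def dominant_net_charge(ph, acidic_pkas, basic_pkas):
    """Net charge when each group is placed in its majority state.

    An acidic group contributes -1 when pH > pKa (deprotonated);
    a basic group contributes +1 when pH < pKa (protonated).
    """
    charge = -sum(1 for pka in acidic_pkas if ph > pka)
    charge += sum(1 for pka in basic_pkas if ph < pka)
    return charge


if __name__ == "__main__":
    PH = 7.3
    # Placeholder acidic pKa values, all below 7.3, standing in for the three "OH" groups.
    acids = [1.0, 2.0, 6.0]
    print(dominant_net_charge(PH, acids, [9.85]))  # amine pKa above the pH: protonated
    print(dominant_net_charge(PH, acids, [7.0]))   # hypothetical too-low amine pKa
```

With the amine pKa above 7.3 the result is -2; once the predicted amine pKa drops below 7.3 the same bookkeeping gives -3, which is the behaviour described in this post.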


I  have improved the amine pKa calculation in our pKa calculator. The improved version will be available soon.


Thanks for the feedback.


Jozsi

User 0cf6155cf3

10-06-2011 21:22:32










Jozsi, could you please reply to this thread with the exact version number for Marvin that will have all of these fixes, and when that version is scheduled to be available? This would help with MetaCyc and ChEBI being able to make plans to use the fixed version of Marvin.


Thanks,


~Tomer

User 851ac690a0

10-06-2011 22:10:43

Hi,


There are many small changes in Marvin, and they affect our other software packages as well. The info for the last successful official build (2011-05-12) is in the attached picture.

The exact time of the patch release depends on many factors, not only on these pKa fixes. This is why I can only say approximately that there will be a release in about a week.


 


Jozsi


 


 


 

User 460fd82ff5

07-08-2011 22:36:00

Hi,


When using the command line program 'cxcalc pka' on compounds with a carboxylic group C(O)=O, the atom which is related to the pKa of the acid is the O with the double bond, instead of the one with the single bond.


For example, acetic acid - CC(O)=O :


 


 


 OpenBabel08081101282D

  4  3  0  0  0  0  0  0  0  0999 V2000
    0.0000    0.0000    0.0000 C   0  0  0  0  0
    0.0000    0.0000    0.0000 C   0  0  0  0  0
    0.0000    0.0000    0.0000 O   0  0  0  0  0
    0.0000    0.0000    0.0000 O   0  0  0  0  0
  1  2  1  0  0  0
  2  3  1  0  0  0
  2  4  2  0  0  0
M  END


 


 


returns this result after running 'cxcalc pka':


 


id apKa1 apKa2 apKa3 apKa4 apKa5 bpKa1 bpKa2 bpKa3 bpKa4 bpKa5 atoms
1 4.54 4



This problem does not occur if the input is the deprotonated form of acetate: CC([O-])=O
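
To check which oxygen a reported atom index refers to, the bond block of the MOL file can be read directly. The sketch below is a minimal reader for small V2000 files such as the one above. It assumes that the reported indices are 1-based and follow the order of the atom block; `bonds_by_atom` is a hypothetical helper and not part of cxcalc.

```python
MOLFILE = """\

 OpenBabel08081101282D

  4  3  0  0  0  0  0  0  0  0999 V2000
    0.0000    0.0000    0.0000 C   0  0  0  0  0
    0.0000    0.0000    0.0000 C   0  0  0  0  0
    0.0000    0.0000    0.0000 O   0  0  0  0  0
    0.0000    0.0000    0.0000 O   0  0  0  0  0
  1  2  1  0  0  0
  2  3  1  0  0  0
  2  4  2  0  0  0
M  END
"""


def bonds_by_atom(molfile):
    """Map 1-based atom index -> (element, list of bond orders) for a V2000 molfile."""
    lines = molfile.splitlines()
    # Locate the counts line instead of assuming a fixed header length.
    start = next(i for i, line in enumerate(lines) if line.rstrip().endswith("V2000"))
    n_atoms, n_bonds = int(lines[start][0:3]), int(lines[start][3:6])
    atoms = {}
    for idx, line in enumerate(lines[start + 1:start + 1 + n_atoms], start=1):
        atoms[idx] = (line.split()[3], [])  # element symbol follows x, y, z
    first_bond = start + 1 + n_atoms
    for line in lines[first_bond:first_bond + n_bonds]:
        # Bond lines use fixed three-character columns: atom 1, atom 2, bond order.
        a, b, order = int(line[0:3]), int(line[3:6]), int(line[6:9])
        atoms[a][1].append(order)
        atoms[b][1].append(order)
    return atoms


if __name__ == "__main__":
    for idx, (element, orders) in bonds_by_atom(MOLFILE).items():
        if element == "O":
            kind = "double-bonded" if 2 in orders else "single-bonded"
            print(idx, kind)
```

For this file, atom 3 is the single-bonded oxygen and atom 4 is the double-bonded one, so an index of 4 in the `atoms` column points at the carbonyl oxygen.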

 

ChemAxon e08c317633

09-08-2011 13:23:41

Hi,


It returns the correct result for me.


$ cxcalc pka "CC(O)=O"
id apKa1 apKa2 bpKa1 bpKa2 atoms
1 4.54 3

$ cxcalc pka "CC([O-])=O"
id apKa1 apKa2 bpKa1 bpKa2 atoms
1 4.54 3

In both cases the atom with index 3 has the acidic pKa; this atom is the oxygen in the OH group.


Note: atom indexes in SMILES and SDF format can be different.


Zsolt


 

User 460fd82ff5

09-08-2011 14:08:00

Hi,


 


I forgot to mention that I was using the "-M true" flag, which is probably what caused the misunderstanding.


 


Thank you,


Elad.

ChemAxon e08c317633

23-08-2011 15:42:54










eladnoor wrote:

I forgot to mention that I was using the "-M true" flag, which is probably what caused the misunderstanding.



You are right, if the -M option is used then the atom indexes related to the pKa are not always reported correctly. We will fix it.


Zsolt

User 460fd82ff5

05-09-2011 11:08:46

Hi,


 


It seems that running cxcalc pka on sulfide - [SH2] - returns no known pKa values. However, ChemAxon does correctly identify the microspecies, since majorms returns different results for pH 6 and 7:


 



$ cxcalc majorms --pH 7 "[SH2]"
id major-ms
1 [SH-]

$ cxcalc majorms --pH 6 "[SH2]"
id major-ms
1 S




Is this a bug or am I using the wrong arguments?


Thanks,


Elad


User 851ac690a0

05-09-2011 12:18:43

Hi,


Yes, there is a bug. It will be fixed asap.

The pKa values are printed with the following command line; try this:


cxcalc pka -P dynamic "[SH2]"
id      apKa1   apKa2   bpKa1   bpKa2   atoms
1       6.66    13.80   -8.97           1,1,1
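
These acidic pKa values are consistent with the majorms results at pH 6 and pH 7 reported earlier in the thread. The sketch below computes macrospecies fractions for a polyprotic acid from its macroscopic acidic pKa values using the standard equilibrium expressions. It ignores the basic pKa (-8.97), which is irrelevant in this pH range, and `macrospecies_fractions` is a hypothetical name rather than a cxcalc feature.

```python
def macrospecies_fractions(ph, acidic_pkas):
    """Fractions of H_nA, H_(n-1)A(-), ... A(n-) at a given pH.

    Each deprotonation step multiplies the relative weight by Ka / [H+].
    """
    h = 10.0 ** (-ph)
    weights = [1.0]  # relative weight of the fully protonated form
    for pka in sorted(acidic_pkas):
        weights.append(weights[-1] * 10.0 ** (-pka) / h)
    total = sum(weights)
    return [w / total for w in weights]


if __name__ == "__main__":
    for ph in (6.0, 7.0):
        h2s, hs, s2 = macrospecies_fractions(ph, [6.66, 13.80])
        print(f"pH {ph}: H2S {h2s:.1%}, HS- {hs:.1%}, S2- {s2:.1%}")
```

With a first pKa of 6.66, H2S dominates at pH 6 and HS- dominates at pH 7, matching the two majorms outputs.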


Jozsi


 

User 5a32847b52

01-05-2012 14:07:21

I have tried to improve the accuracy of calculated pKa values for polyphosphate derivatives by using a library with literature data for monophosphates with the same modifications. For example the reported pKa values for methylthio-monophosphate are 0.62 and 4.86. Without correction, Marvin 5.9.3 calculates these as 2.47 and 7.22, with correction library as 1.14 and 5.89.


More importantly, I want to predict the pKas in di- and triphosphates where one OH is replaced with the methylthio group. Curiously, applying the correction here has no effect on any of the calculated pKas of methylthio diphosphate. However, if I replace the acidic OH groups at the second phosphorus atom with another group or atom, the effect becomes notable again. The same goes for other modifications I have tested: all result in quite large differences between literature and calculated pKas in the monophosphates, but the correction works for diphosphate derivatives only if the second phosphorus atom carries no OH groups.


I am using Marvin and InstantJChem 5.9.3.  Do you have any suggestions for me?


Thomas


 








 

ChemAxon e08c317633

18-05-2012 13:50:55

Thank you for reporting this problem, we will check it and get back to you. We answered you in email as well.


Zsolt 

User 53611d53ae

24-07-2012 18:50:32










Jozsi wrote:

Hi,


I've fixed the inconsistency.


The fix will be available ... soon.


Thanks  to everybody who contributed for fixing this bug, either by sending a couple of structures or/and a valuable comments.


Jozsi



Hello Jozsi,


Marcus Ennis from ChEBI just alerted me to the fact that in version 5.9 the major species predicted for GDP at pH 7.3 is the 2-minus species instead of the 3-minus species (similar to other problems described earlier in this thread).

You wrote that you fixed the problem, but I just checked with 5.10 and the 2-minus is still predicted. Did your fix not get incorporated into version 5.10?


Cheers,


Ron Caspi


SRI International

User 53611d53ae

24-07-2012 18:55:45

Jozsi,


I need to be more precise: I just checked the protonation of ADP in 5.10, and it is minus-3 as it should be. So I guess your fix did make it into this version.


However, GDP comes up as minus-2, even though previous versions of Marvin (e.g. 5.2.2) predicted it to be minus-3.


Ron

ChemAxon afdac7b783

25-07-2012 08:24:26

Dear Ron,



Thank you for your feedback.

Jozsi is on holiday, and he will deal with this topic when he comes back.


Best Regards,



Viktoria

User 851ac690a0

30-07-2012 11:00:46

Hi,


 


The calculation itself is at least good. It is shown in the attached figure.

The charge of the "GDP" major form is (-3) at pH=7.3. This dominant ionic form is present at 61.16%.

The "ATP" major form has charge (-4) at pH=7.3. Its relative amount is 61.19%.

Something is wrong in the major form representation. We will fix this bug asap.


Thanks for reporting this problem.


 


Jozsi

User 53611d53ae

30-07-2012 15:28:59

Hi Jozsi,


Welcome back from your vacation!


Thanks for looking into it and figuring out the problem; we are looking forward to a quick fix. Hopefully it could go into a 5.10.x release and not wait for 5.11.


Best,


Ron

User 851ac690a0

02-08-2012 13:20:15

Hi,


I've attached a figure showing the 'resultant charge' distribution of ADP/ATP, GDP/GTP and IDP/ITP. The six molecules are included in the attached 'sdf' file.

The "resultant charge" values are based on the accumulated charges of the individual microspecies that coexist in the solution at a given pH. So this value is a reliable parameter for characterising a molecule with respect to its 'charge state'.

Perhaps the "resultant charge" value, rounded to an integer, can support your development efforts.
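
One way to read the "resultant charge" is as a population-weighted mean charge. The sketch below follows that reading, which is an interpretation of this post and not a confirmed description of Marvin's implementation. The 61.19% share for ATP(-4) at pH 7.3 is quoted earlier in the thread; assigning the remainder entirely to ATP(-3) is an assumption made only for this example, and both function names are hypothetical.

```python
def resultant_charge(distribution):
    """Population-weighted mean charge; `distribution` maps charge -> fraction."""
    total = sum(distribution.values())
    return sum(charge * fraction for charge, fraction in distribution.items()) / total


def major_charge(distribution):
    """Charge of the single most populated charge state."""
    return max(distribution, key=distribution.get)


if __name__ == "__main__":
    # 61.19% ATP(-4) is from the thread; the 38.81% ATP(-3) remainder is assumed.
    atp = {-4: 0.6119, -3: 0.3881}
    mean = resultant_charge(atp)
    print(f"resultant charge = {mean:.2f}")
    print(f"rounded = {round(mean)}, major = {major_charge(atp)}")
```

The weighted mean changes smoothly with the fractions, whereas the single most populated state can flip when two states are close to 50/50, so the two measures can disagree near a pKa.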


 


What we do now is a simple evaluation of the "major ionic form" from a single ionic species. This method is very sensitive to the accuracy of the calculated pKa values.

Corollary: combining the "resultant charge" and the "major ionic form" methods should result in a better consensus model.

I am working on this now, and it will be available in the 5.11 version in September.


 


Jozsi

User 53611d53ae

22-10-2012 23:17:12

Hello ChemAxon,


After some exchanges between ChEBI and SRI we confirmed a new problem with the protonation in version 5.10 (I am using 5.10.1_b76).


In the compound alpha-D-ribose-1-methylphosphonate-5-triphosphate (see attached image, this compound will be CHEBI:68685 at some point), at pH 7.3 the oxygen at the end of the triphosphate group should be charged. It worked properly in version 5.3.6, which is still used by some ChEBI curators. However, version 5.10 predicts that this oxygen retains its hydrogen - I attach a screen capture. I believe that the prediction of 5.10 is incorrect.


Ron

User 851ac690a0

24-10-2012 17:06:57

Hi,


 


An improved version, based on the consensus model I proposed in my earlier post above, will be available in the 5.12 release in December.


Thanks for reporting this problem.


Jozsi

User 9be621d283

21-01-2013 10:07:14

Hi Jozsi, ChemAxon,

We have just made a major re-evaluation of the protonation state of all the compounds (4000) we use in the Rhea reaction database.


We used the release 5.10 of the pKa calculation plugin to calculate the Major Microspecies at pH 7.3 (the reference pH for Rhea) and realized that a lot of changes appeared compared to earlier results. We did not notice any change (improvement) using the latest release from December (5.11.4).


In the examples below, we have compared MarvinView version 5.11.4 with version 5.2.2, which is the version used by the ChEBI public web site.


Here are some examples with strange or erroneous behaviour for the computation of major microspecies at pH 7.3:

CHEBI:58395 myricetin http://www.ebi.ac.uk/chebi/searchId.do?chebiId=CHEBI%3A58395
CHEBI:57769 precorrin-4 http://www.ebi.ac.uk/chebi/searchId.do?chebiId=CHEBI%3A57769
CHEBI:57830 cyanidin 3-O-rutinoside 5-O-beta-D-glucoside betaine http://www.ebi.ac.uk/chebi/searchId.do?chebiId=CHEBI%3A57830
CHEBI:29323 methylazoxymethanol http://www.ebi.ac.uk/chebi/searchId.do?chebiId=CHEBI%3A29323
CHEBI:67135 2-nitroimidazole http://www.ebi.ac.uk/chebi/searchId.do?chebiId=CHEBI%3A67135
nucleotides
selenous acid
tellurous acid

 



 
Nucleotides: while ATP and GTP have Major Microspecies with charge -4, UTP, TTP and XTP MMs have a net charge of -3. This is an unexpected behaviour!

We also discovered an additional problem with inorganic acids.
For selenous acid, the Marvin plugin calculation indicates 2 pKa's around 12 and 17, whereas other sources provide values around 2.5 and 7.3 (or 8)
(http://en.wikipedia.org/wiki/Selenous_acid,http://www2.ucdsb.on.ca/tiss/stretton/database/polyprotic_acids.htm)


For tellurous acid, the Marvin plugin calculation indicates 2 pKa's around 13.5 and 18, whereas other sources provide values around 2.5 and 7.7.
(http://en.wikipedia.org/wiki/Sodium_tellurite)

You will find attached to this post an SDF file (structures_to_check.sdf) containing a set of compounds that were calculated to be the major structure at pH 7.3 using earlier versions of the Marvin plugin, but are not using the latest version (5.11.4).
We would very much appreciate it if you would help us check the compounds in the list and determine the major microspecies at pH 7.3.


Thanks for looking into it and figuring out the problem.


Best,



Anne

User 851ac690a0

21-01-2013 14:09:31

Hi Morgat,


 


Thank you for reporting these bugs.


I will check them and get back to you asap.


 


Jozsi

User 851ac690a0

29-01-2013 17:09:57

Hi,


 


1. CHEBI:58395 myricetin
Deprotonation of the flavonol is preferred at the 4' or the 7 position.
See the proposed major deprotonated form at pH=7.3 in the attached figure.

2. CHEBI:57769 precorrin-4
The proposed net charge is -7, because not only are the carboxyl groups deprotonated, but there is also a "5 atoms length delocalization path" for holding one more proton.
See the attached figure.

3. CHEBI:57830 cyanidin 3-O-rutinoside 5-O-beta-D-glucoside betaine
The proposed net charge of the major form is -1 at pH=7.3.
I fixed a bug in the major form calculation. The fix will be available in the 5.12 version.
See the attached figures.

4. CHEBI:29323 methylazoxymethanol
This bug was fixed in the released version 5.11.5.

5. CHEBI:67135 2-nitroimidazole
The proposed net charge is -1 at pH 7.3, because the experimental pKa is 7.1.

6. Major form of nucleotides
The calculation will be more correct with the consensus model, which will be available in the 5.12 version, as I mentioned above.

7. selenous acid / tellurous acid
Yes, their pKa values are about 2.0 and 7.0 according to Pauling's rule.
I implemented this rule in the pKa calculator and it will be available in the 5.12 version.
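
For reference, the textbook form of Pauling's rules for an oxoacid written as EO_p(OH)_q estimates the first pKa as roughly 8 - 5p and adds about 5 units for each further deprotonation. The sketch below implements only that textbook form. The exact constants used inside Marvin are not given in this thread, and `pauling_pkas` is a hypothetical name.

```python
def pauling_pkas(n_oxo, n_hydroxyl):
    """Estimated successive pKa values for an oxoacid EO_p(OH)_q.

    Textbook Pauling rules: pKa1 is about 8 - 5*p, and each later
    deprotonation step is about 5 units higher than the previous one.
    """
    first = 8 - 5 * n_oxo
    return [first + 5 * step for step in range(n_hydroxyl)]


if __name__ == "__main__":
    # Selenous acid written as SeO(OH)2 and tellurous acid as TeO(OH)2:
    # one oxo group and two hydroxyl groups each.
    print(pauling_pkas(n_oxo=1, n_hydroxyl=2))
```

For one oxo group and two hydroxyl groups the estimate is about 3 and 8, which is in the same range as the literature values cited in this thread and far from the calculated values of 12 to 18 that were reported.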


 


If you have more problematic structures  just send them.


 


Jozsi

User 3b25c83a0c

04-04-2013 07:30:18

Hi Jozsi,


 


Thank you for your reply. Unfortunately, we made a mistake in the list and did not point to the most important problem. This concerns the nitramide groups. An example of such a compound is CHEBI:67135 2-nitroimidazole (item 5 in the list). The overall calculation of charge is correct (1-) but this was not the main problem.

The main problem is that the number of bonds to one of the N atoms is wrong.


The most simple example is nitramide http://www.ebi.ac.uk/chebi/searchId.do?chebiId=CHEBI:29273 and other examples are http://www.ebi.ac.uk/chebi/searchId.do?chebiId=CHEBI:19075 and http://www.ebi.ac.uk/chebi/searchId.do?chebiId=CHEBI:25296.


In all cases the N linked to 2 O atoms has 4 bonds (a double bond to one O, and a single bond to the other) and has a + charge. The O attached with the single bond has a - charge. This results in the N being in oxidation state III.


When I perform a major microspecies calculation with 5.12.0 (and also previous versions), in all cases the 2 O's are now both connected with double bonds to the N (that has no charge) but this results in the N being in oxidation state V and that is not correct.


If I add an H to the O- and again ask for the major microspecies, then the N is not changed to N(V).


 


Regards,


 


Kristian

User 851ac690a0

08-04-2013 16:33:27

Hi,


 


There are two problems with the "NO2" groups.

1.

In our system the "NO2" groups may have two alternative forms: the "traditional" and the "ylide" forms.

More details are given under "15th Group" at the link below:

http://www.chemaxon.com/marvin/help/sci/ValenceCalculator.html

All calculations will return exclusively the "ylide" form of the "NO2" type nitrogen in the 6.0 or in the 6.1 version.

2.

The conjugate acid of the "NO2" group is generated by adding an "H" atom to the "O-" atom. The pKa calculation failed for this extremely strong acid (pKa is about -11.0). This bug will be fixed in 6.0.


 


Thanks for reporting these problems.


 


Jozsi

User 3b25c83a0c

02-08-2013 11:05:37

Hi Jozsi,


 


You write:




All calculations will return exclusively the "ylide" form of the "NO2" type nitrogen in the 6.0 or in the 6.1 version.


I have just tested chebi:27798, nitrobenzene, and when I ask for the Major Microspecies at pH 7.3, Marvin 6.0.3 returns "the traditional form", see attached.


So unfortunately the problem has not been solved.


 


Regards,


Kristian


 

User 851ac690a0

08-08-2013 23:42:14

Hi,


I have fixed the ylide issue of the "NO2" group.  The major forms of 3 typical structures are shown on the attached figures, at pH=1 and at pH=10. 


I very much hope that this fix will be available in the 6.1 version.


Jozsi

User 3b25c83a0c

03-09-2013 12:04:38

Dear Jozsi,


 


When looking for the Major Microspecies in version 6.0.3, I sometimes get different results from the Major Microspecies tool and the pKa calculation. I hope you can correct this error.

For example:
For ribose 5'-triphosphate, I get the -4 species as the most prevalent at pH 7.3 using the Major Microspecies tool, but the -3 using the pKa tool.


 


Likewise,


for phytyl diphosphate (CHEBI:18187), I get the -3 species as the most prevalent at pH 7.3 using the Major Microspecies tool, but the -2 using the pKa tool.


 


Thank you in advance for looking into this.


Regards,


Kristian

User 851ac690a0

15-11-2013 00:52:01

Hi,


 



..Major Microspecies tool and the pKa calculation...

The "Major Microspecies" tool does not simply "read off" the largest value of the "microspecies distribution".

A couple of "risk factors" are also built into the model of the "Major Microspecies" tool.

Therefore, there is sometimes a difference between the major form predicted from the "distribution curve" model and the one given directly by the "Major Microspecies" model.

Especially in the case of multi-ionizable molecules, the "Major Microspecies" tool creates reliable ions that may differ from the "distribution curve" diagram.

I propose that you use the "Major Microspecies" tool instead of calculating the major form from the "distribution curves".

Of course, we will try to improve the GUI in this respect.


Thanks.


Jozsi