The Bioinformatics Research Group at SRI is trying to use the Marvin
API to correctly and consistently protonate the compounds in our
EcoCyc Pathway / Genome Database (EcoCyc.org) based on the
reference cytosolic pH of 7.3 for E. coli.
After running over a thousand of our chemical compounds with
structures through the Marvin Calculator plug-in for pKa
(chemaxon.marvin.calculations.pKaPlugin) to obtain the major pKa
values, we sampled the results for verification by taking the
compounds that had at least one Marvin pKa within the range of
6.3 to 8.3 pH units, that is, within one unit of the reference pH.
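A minimal sketch of this selection step, assuming the predicted pKa values have already been exported from Marvin into a plain mapping. The compound names and values in `predicted` are made-up placeholders, not results from our run, and the helper name is hypothetical:

```python
# Window of interest: one pH unit either side of the reference pH of 7.3.
LOW_PH = 6.3
HIGH_PH = 8.3


def has_pka_in_window(pka_values, low=LOW_PH, high=HIGH_PH):
    """Return True if at least one pKa lies within [low, high]."""
    return any(low <= pka <= high for pka in pka_values)


# Placeholder data (hypothetical): compound name -> predicted pKa values.
predicted = {
    "compound_a": [2.1, 6.9],
    "compound_b": [3.4, 11.2],
    "compound_c": [8.3],
}

# Keep only the compounds worth checking against experimental values.
selected = sorted(
    name for name, pkas in predicted.items() if has_pka_in_window(pkas)
)
print(selected)
```

The bounds are compared directly rather than as a distance from 7.3, so a value sitting exactly on 6.3 or 8.3 is not dropped by floating-point rounding.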
We then looked up these compounds in the IUPAC book "Ionisation
constants of organic acids in aqueous solution" (see link to
bibliographic information below) to validate that the Marvin pKa's
were close to the experimentally-verified values.
Nine of the compounds that we were able to find in the IUPAC book had
pKa values that differed from the Marvin pKa values by more than
0.5 pH units. They are listed below.
One thing that we've noticed is that, of the ones we've identified, all
but two (adenosine and xanthine) have phosphate groups. We suspect that
the algorithm that Marvin uses for calculating the pKa values isn't
performing as well for phosphate groups.
For our research purposes, it is very important that we can rely on
the output of Marvin to be as accurate as possible. We are requesting
assistance in pinning down the cause of this discrepancy, and possibly
in tuning the pKa calculation algorithm that Marvin employs.
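To illustrate why a difference of about 0.5 pH units matters for protonation at pH 7.3, here is a rough sketch using the Henderson-Hasselbalch relation. It assumes a single, independent acidic site, which is a simplification for polyprotic compounds, and it uses the adenosine-5'-phosphate pair of values from this post (6.78 from Marvin, 6.23 from IUPAC). The function name is hypothetical:

```python
REFERENCE_PH = 7.3


def deprotonated_fraction(pka, ph=REFERENCE_PH):
    """Fraction of a single acidic site that is deprotonated at the given pH.

    Henderson-Hasselbalch: [A-] / ([HA] + [A-]) = 1 / (1 + 10**(pKa - pH)).
    Assumes one independent ionisable site (a simplification).
    """
    return 1.0 / (1.0 + 10.0 ** (pka - ph))


# adenosine-5'-phosphate, values taken from this post.
marvin_fraction = deprotonated_fraction(6.78)
iupac_fraction = deprotonated_fraction(6.23)

print(f"Marvin pKa 6.78: {marvin_fraction:.2f} deprotonated")
print(f"IUPAC pKa 6.23: {iupac_fraction:.2f} deprotonated")
```

Under these assumptions the two pKa values give roughly 77% versus 92% deprotonated at pH 7.3, so the choice of value shifts the estimated population of each protonation state noticeably.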
Compounds and their pKa's (with the Marvin value followed by the IUPAC value):
adenosine (12.87, 12.35)
guanosine triphosphate (7.12, 7.65)
cytidine-5'-triphosphate (7.12, 7.65)
adenosine-5'-phosphate (6.78, 6.23)
glucose-6-phosphate (6.77, 6.11)
adenosine-5'-diphosphate (7.21, 6.44)
fructose-6-phosphate (6.77, 5.84)
thymidine-5'-triphosphate (9.36, 10.7)
xanthine (6.48, 7.53)
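The comparison behind this list can be sketched as follows. The value pairs are copied from the list above; the 0.5 threshold is the one stated in this post, while the function and variable names are hypothetical:

```python
# (Marvin pKa, IUPAC pKa) pairs, copied from the list in this post.
PKA_PAIRS = {
    "adenosine": (12.87, 12.35),
    "guanosine triphosphate": (7.12, 7.65),
    "cytidine-5'-triphosphate": (7.12, 7.65),
    "adenosine-5'-phosphate": (6.78, 6.23),
    "glucose-6-phosphate": (6.77, 6.11),
    "adenosine-5'-diphosphate": (7.21, 6.44),
    "fructose-6-phosphate": (6.77, 5.84),
    "thymidine-5'-triphosphate": (9.36, 10.7),
    "xanthine": (6.48, 7.53),
}

THRESHOLD = 0.5  # pH units


def flag_discrepancies(pairs, threshold=THRESHOLD):
    """Return (name, Marvin minus IUPAC) for pairs differing by more than
    the threshold, sorted with the largest absolute difference first."""
    rows = [
        (name, round(marvin - iupac, 2))
        for name, (marvin, iupac) in pairs.items()
    ]
    flagged = [row for row in rows if abs(row[1]) > threshold]
    return sorted(flagged, key=lambda row: abs(row[1]), reverse=True)


flagged = flag_discrepancies(PKA_PAIRS)
for name, diff in flagged:
    print(f"{name}: {diff:+.2f}")
```

The signed difference is kept so that it is visible whether Marvin predicts a higher or lower value than the experimental one for each compound.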
"Ionisation constants of organic acids in aqueous solution"