Tautomers and Conformers of Vitamin C
User 25d107bd42
07052008 12:49:12
Hi,
while evaluating the MarvinSketch tools, I made the tautomer calculation for Vitamin C = L(+)ascorbic acid, see screenshot518. The result was very unexpected. The tautomer 3) we are always formulating got a distribution of 0 %. I didn't browse the literature, but is tautomer 1) really the most important tautomer ? Vitamin C is always formulated as tautomer 3), because of the stabiltiy of the conjugated carbonylenol structure.
To find more information I used the tool conformers and calculated the Dreiding energies for these tautomers. Each calculation took more than one hour ! Please see the next three posts. The calculation for tautomer 4) stopped after 1.5 hours and there was no result.
User 25d107bd42
07052008 12:57:11
Conformers calculation for Vitamin C tautomer 1)
User 25d107bd42
07052008 13:00:23
Conformers calculation for Vitamin C tautomer 2)
User 25d107bd42
07052008 13:03:21
Conformers calculation for Vitamin C tautomer 3)
User 25d107bd42
07052008 13:15:29
Screenshot526 shows the summary of these calculations.
In the tautomer calculation tautomer 1) has the greatest distribution.
In the conformer calculation tautomer 3) has the best energy (27.51 kcal/mol) for the best conformer. This looks better.
But the tautomer calculation refer to water solutions and the Dreiding energy corresponds to vakuum. Is that the reason for the difference ?
Regards, HansUlrich
User 851ac690a0
07052008 14:10:03
Hi,
The stability difference of the tautomers of Vitamin C (in solution) is greatly related with the intermolecular hydrogen bonds. See attached figure.
This hydrogen bond effect is not included in the distribution calculation, therefore, the calculated distribution is not in agreement with the generally accepted formulation of Vitamin C.
Jozsi
User 25d107bd42
07052008 15:07:56
Hi Jozsi,
this is a good explanation. The water solution is important.
And it is interesting the Dreiding Energy calculation gives the result, one is expecting by discussion of substituent effects.
Regards, HansUlrich
PS: Nice to see which version of Marvin you are using ;)
ChemAxon 8b644e6bf4
13052008 15:17:02
Dear HansUlrich,
HUWagner wrote: 
To find more information I used the tool conformers and calculated the Dreiding energies for these tautomers. Each calculation took more than one hour ! Please see the next three posts. The calculation for tautomer 4) stopped after 1.5 hours and there was no result. 
These calculation times even for a full conformational analysis (for these structures) are unacceptable. I will check the underlying cause and post a comment about it here.
regards,
Gabor
User 25d107bd42
20052008 06:39:23
Hi, there are still problems with the conformer calculations.
I used the following script to get the conformer calculation of tautomer 4.
Code: 
Two posts below there is the optimized code.

Applying the script to tautomer 4 "VitaminCT4.mrv" I got after 2 hours the failure output "Out of memory...". In other cases the script works fine and puts the result in the *.sdf file.
My machine is not very fast, but it has 1 GB memory.
How to get more JAVAMemory ?
Regards, HansUlrich
ChemAxon 8b644e6bf4
20052008 16:03:19
Dear HansUlrich,
HUWagner wrote: 
My machine is not very fast, but it has 1 GB memory.
How to get more JAVAMemory ?

You can use the Xmx option to specify Java maximum heap size:
Code: 
cxcalc Xmx640M conformers ...

will let the JVM use at most 640 MBytes heap size.
regards,
Gabor
User 25d107bd42
21052008 18:09:22
Dear Gabor,
thank for your help. Now it works. I optimized the script, see the new code.
Code: 
#!/bin/bash
#
# Calculating conformers
#
echo ""
date
echo "Calculating conformers for $1"
echo "input $1.mrv"
echo "output $1confs.sdf"
time cxcalc Xmx640M conformers \
optimization 3 maxconformers 5000 diversity 0.1 timelimit 90000 \
prehydrogenize true hyperfine true saveconfdesc false \
$1.mrv >$1confs.sdf
echo
ls l $1*
date
echo ""

The results will be shown in the next post.
Regards, HansUlrich
User 25d107bd42
21052008 18:23:51
Dear Gabor,
using this script I recalculated the 4 tautomers of Vitamin C listed above.
The results using the command line procedure are slightly different from the results using the MarvinGUI, see the list eight posts above ("Summary of itamin C tautomers") . The numbers of conformers are different and especially the energies of the "last" (most instable) conformers. Why ?
Regards, HansUlrich
ChemAxon 8b644e6bf4
23052008 12:09:56
Dear HansUlrich,
My first guess would be that the implementation of the "hyperfine" contains bugs (as you noted in an other topic:
http://www.chemaxon.com/forum/viewpost15492.html#15492). Since we are currently rewriting major parts of the Clean3D and the Dreiding force field implementation I think it would not be worth tracking down the cause in the current (released) implementation. As soon as our test shows that the development version is free of apparent bugs i will look for the possibility to provide it to you before its release.
regards,
Gabor
User 25d107bd42
23052008 13:13:49
Dear Gabor, OK. I would like to contribute by testing. Regards, HansUlrich
User 25d107bd42
03062008 18:56:36
Hi Gabor,
I tested the options 0=loose, 1=normal, 2=strict, 3=very strict using the script above and only changing the value after "optimization". The results are shown in the screenshot attached.
1) The calculations with the options "loose" and "normal" didn't find an end. I must kill them using CTRLC after 2  6 hours.
2) The calculations with the options "strict" and "very strict" do not differ essentially and the calculation times are not very different.
3) Interestingly the numbers of conformers are a little bit higher for the option "strict". But the question is, how different are the conformers really ?
So it's now the question, why to use the options 0 and 1 ?
Does ist make sense to have these options at all ?
(BTW: I have a new machine, which is faster than the machine in the tests above, as you can see comparing the results in this post and in the posts above. Other consequences given by the hardware and software changes will be discussed in a new topic).
Regards, HansUlrich
ChemAxon 8b644e6bf4
04062008 19:59:47
Dear HansUlrich,
Thanks for the detailed measurements. I still have not tested why it took so long exactly. About the optimization limit: this determines the magnitude of the remaining gradient where the geometry optimization process should stop. Setting this is a trade of time vs quality of the generated structure in most cases.
In the last few days i have looked into the hyperfine and it seems to be free of hard bugs. However setting up correct parameters is crucial: when the above mentioned optimization criteria is loose and the diversity limit is low then  after the MD runs  the optimizer might stop (because of the loose limit) before getting close enough (within the specified diversity limit) to the starting conformers representing true local energy minimum. This can result in "endless" number of generated conformers (which are around true local energy minimums).
After a few tests i would suggest that the diversity limit should be raised to at least 0.15 or even higher (0.20.4) and only the "very strict" optimization limit should be used. In the next release probably we will use the very strict limit only in hyperfine. (MD temperature probably also will be cooled and the minimal diversity limit will be elevated.)
So, concluding the above, only the "very strict" optimization criteria is reasonable with setting up little higher diversity limit.
If you have any further questions, do not hesitate to ask them. In the near future i will do a detailed test and post the results.
Regards,
Gabor
User 25d107bd42
13062008 21:45:20
Hi Gabor, I have done further tests.
With some hints of our SysAdmin
s now I use the following optimized script : (BTW: UNIX/Linux is
the optimal operating system :)
Code: 
#!/bin/bash
#
# script file conformers
#
# starting with f.e. ./conformers ethanol >>ethanol.txt
#
# *appends* the calculation parameters to ethanol.txt
# *appends* the calculation output to $1confs.sdf
#
echo ""
date
echo "Calculating conformers for $1"
echo "input $1.mrv"
echo "output $1confs.sdf"
(time cxcalc Xmx640M conformers \
format sdf maxconformers 5000 diversity 0.1 saveconfdesc false \
hyperfine true prehydrogenize true timelimit 90000 optimization 3 \
$1.mrv >>$1confs.sdf ;) 2>&1
date
ls l $1*
echo ""

I have done 2 times the same calculation 3 times ( = 6 conformer calculations) :
1) Three times the same: conformation optimization = 1 on ethanol. See the results in the screenshot on the left side.
2) Three times the same: conformation optimization = 3 on ethanol. See the results in the screenshot on the right side.
(I hope the screenshots would be readable, but I didn't find the option to enlarge the font size).
Now I have problems.
a) The results in the three optimization runs should be the numerically the same ! The results for equal conformers (f.e no. 1, 4 and 7, the "best" transoidconformer of ethanol) must be numerically equal. But there are differences in the fourth digit for opt = 1 and the sixth digit for opt = 3. Doing the same thing gives numerically different results ? I don't understand it !
b) To use more than two digits accuracy does not make any sense. So the options "hyperfine=false" and optimization limits lower than 3 are inacceptable.
To get poorer results in dependance of the calculation time is also inacceptable.
The options "optimization limit" and "hyperfine" could be omitted.
Regards, HansUlrich.
User 25d107bd42
18062008 06:37:47
Hi, one question more:
Does your JAVA code use float or double variables ?
Regards, HansUlrich
ChemAxon 8b644e6bf4
19062008 02:19:26
Dear HansUlrich,
Quote: 
a) The results in the three optimization runs should be the numerically the same ! 
I would like to argue with the above since using "hyperfine" all of the generated conformers are passed through an MD run and a subsequent MM optimization: this  in case of proper operation  definitely should result in slightly different conformers. (In this case numerically equal energies could be indicate a bug!) To be more specific: MD runs are initialized using random starting velocities, so ideally after the runs our conformers are in the neighborhood (and not at in) local minima. The subsequent MM optimization starts from these structures and optimize them until a limit in the gradient reached. So the optimized structures will be in the close vicinity of the closest local energy minimum. Quote: 
b) To use more than two digits accuracy does not make any sense. So the options "hyperfine=false" and optimization limits lower than 3 are inacceptable. 
Indeed, the conformer selection dialog in the GUI uses a reduced precision already. However, there can be use cases demanding full precision of the calculated energy (consider using energy as a geometrydependent hash), so it should be stored in the ENERGY property with full resolution. Hyperfine=false and possibly looser optimization criteria is reasonable for sampling conformational space of flexible molecules required. Quote: 
The options "optimization limit" and "hyperfine" could be omitted.

I think the hyperfine  havng correct default parameters  should not be excluded since it is the tool for eliminating false energy minimum situations. The optimization limit is also a useful parameter in certain situations. However, hyperfine should be used only with the strictest (3) optimization limit as i wrote earlier. (In 5.1 hyperfine automatically uses this limit.) (MD parameters  temperature and step count  will also be tuneable in the future. The third important parameter, the diversity limit having 0.1 default value also seems to be too sharp, this probably also will be loosen) Quote: 
Does your JAVA code use float or double variables ? 
We use double variables through the Clean3D and related codes.
regards,
Gabor
User 25d107bd42
19062008 08:37:48
Hi Gabor,
thank you for your detailed answer. Now I have to think over it.
And I am eagerly awaiting Marvin 5.1 ;)
Regards, HansUlrich
User 25d107bd42
30062008 09:49:50
Hi Gabor,
I made some more test calculcations and each calculation gives different results, see the screenshot bf0031. The nonplanar structure sholuld not be a minimum and it should not have a lower energy than the planar molecule (The planar molecule has always 246.74 kcal/mol). I think we are at the limit of the accuracy of the calculations.
As already mentioned above, the results are not reproducable, obviously coming from the random starting velocities.
Regards, HansUlrich
ChemAxon 8b644e6bf4
30062008 22:01:53
Dear HansUlrich,
It seems that the force field does not recognize the cyclopropene "triangle", so the assigned equilibrium bond angles around the SP2 carbons produce a nonplanar energy minimum. (Fixing this is already scheduled.) The fragment builder however generates the correct, planar structure, which represents a false energy minimum (according to the force field) which can not be broken by a single optimization. Hyperfine supposed to handles these situations (in this case, revealing a bug in the FF) by "disturbing" the structure with short MD runs as discussed earlier.
In the 5.0 version it seems that the MD itself using the "very strict" optimization limit has enough initial temperature/step count to reveal this false energy minimum for most of the times. When the diversity limit is increased enough to make the two conformers equivalent the faulty one having lower energy will be kept.
I suppose that the the different energies on the screenshot are resulted by a looser optimization limit; a quick test gave a bit more consistent results for me.
Regards,
Gabor