Automated pKa output

User 259f5a561c

03-07-2005 21:34:34

I know there was an earlier post in which someone wanted to automate pKa calculation and capture the results in a file. The solution given involved the API and/or direct use of cxcalc. I'm looking to do the same thing, i.e. write a batch file that runs cxcalc and captures the results in one or more files. I know that the pka calculation in cxcalc outputs pKa values, but I also need the data for the pH distribution table and the chemical make-up of the different species resulting from the analysis.





Is there any way to make cxcalc write these results to a file? I'm not too familiar with coding in Java, so I'd like to be sure that this is possible before trying to tackle plugin APIs and such.





Any help would be greatly appreciated.

ChemAxon 7c2d26e5cf

04-07-2005 13:10:56

Nora, who is the expert on this subject, is on holiday. She will return tomorrow and will answer your question once she is back.

ChemAxon fb166edcbd

05-07-2005 11:00:51

Currently, cxcalc pka only returns the pKa values, and we have cxcalc majorms to display the major microspecies at a given pH:


Code:



cxcalc majorms -H 3.5 "NCC1CCCC(C1)C(O)=O"


id      major-ms


1       [NH3+]CC1CCCC(C1)C(O)=O





However, cxcalc does not yet give access to the complete microspecies distribution as a pH-percentage table for each microspecies. I can add this if you need it. Which output format would you prefer? Somehow cxcalc has to list all microspecies with pH-percentage pairs. Is there any other data you need?
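The table output shown above can be captured and post-processed with a short script. A minimal Python sketch, assuming the table is whitespace-separated with a header row, as in the example; the function names parse_cxcalc_table and write_csv are made up for this illustration and are not part of cxcalc:

```python
import csv
import io

def parse_cxcalc_table(text):
    """Parse whitespace-separated table output into a list of dicts."""
    lines = [ln for ln in text.splitlines() if ln.strip()]  # drop blank lines
    if not lines:
        return []
    header = lines[0].split()
    rows = []
    for ln in lines[1:]:
        # Split into at most len(header) fields (SMILES contain no spaces).
        fields = ln.split(None, len(header) - 1)
        rows.append(dict(zip(header, fields)))
    return rows

def write_csv(rows, handle):
    """Write the parsed rows as CSV to an open text handle."""
    if not rows:
        return
    writer = csv.DictWriter(handle, fieldnames=list(rows[0].keys()))
    writer.writeheader()
    writer.writerows(rows)

# Sample text copied from the majorms example in this thread.
SAMPLE_TABLE = """id      major-ms

1       [NH3+]CC1CCCC(C1)C(O)=O
"""

rows = parse_cxcalc_table(SAMPLE_TABLE)
buffer = io.StringIO()
write_csv(rows, buffer)
print(buffer.getvalue())
```

To write to disk instead of an in-memory buffer, pass a handle opened with open("result.csv", "w", newline="") to write_csv.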

User 259f5a561c

05-07-2005 16:44:16

Nora,


Thanks a lot for the help so far. It would be great if I could get access to the microspecies distribution as a pH-percentage table. Ideally this would be output to a text file, with each major microspecies followed by its percentage table. I think the only data I would need for each microspecies would be its formula and charge.





Thanks again.















ChemAxon fb166edcbd

05-07-2005 17:08:31

Since the Marvin 4.0 release is very close now, it seems that I can only add this to the next major release, Marvin 4.1, but we can give you a test release as soon as the feature is implemented.

User 259f5a561c

05-07-2005 20:29:23

Thanks.





Is there any way that I can output just the formulas for the various microspecies? Ideally I would like to have the formula for each microspecies and its associated pKa value. It seems like I could do this by looping through each pH value with majorms, but if a microspecies is never dominant, it won't be processed.

ChemAxon fb166edcbd

06-07-2005 17:21:03

Currently this is not possible either, but I will add this simpler version (output of the microspecies only) for Marvin 4.0.

ChemAxon fb166edcbd

06-07-2005 20:20:30

I have added the following extension to the majorms calculation: if no pH is specified, it outputs all microspecies.


By default, in the table form, the microspecies are fused into a single molecule because only one output can be written in one table column:





Code:



cxcalc majorms "NC(CO)C1=CNC=C1C(O)=O"


id      major-ms


1       NC(CO)c1c[nH]cc1C(O)=O.NC(CO)c2c[nH]cc2C([O-])=O.NC(CO)c3c[n-]cc3C(O)=O.NC(CO)c4c[n-]cc4C([O-])=O.NC(C[O-])c5c[nH]cc5C(O)=O.NC(C[O-])c6c[nH]cc6C([O-])=O.NC(C[O-])c7c[n-]cc7C(O)=O.NC(C[O-])c8c[n-]cc8C([O-])=O.[NH3+]C(CO)c9c[nH]cc9C(O)=O.[NH3+]C(CO)c%10c[nH]cc%10C([O-])=O.[NH3+]C(CO)c%11c[n-]cc%11C(O)=O.[NH3+]C(CO)c%12c[n-]cc%12C([O-])=O.[NH3+]C(C[O-])c%13c[nH]cc%13C(O)=O.[NH3+]C(C[O-])c%14c[nH]cc%14C([O-])=O.[NH3+]C(C[O-])c%15c[n-]cc%15C(O)=O.[NH3+]C(C[O-])c%16c[n-]cc%16C([O-])=O








However, if you specify the output format in -f and skip the table header and the molecule ID (cxcalc option: -N hi), you can get a list of all microspecies:





Code:



cxcalc -N ih majorms "NC(CO)C1=CNC=C1C(O)=O" -f smiles


NC(CO)c1c[nH]cc1C(O)=O


NC(CO)c1c[nH]cc1C([O-])=O


[NH3+]C(CO)c1c[nH]cc1C(O)=O


NC(CO)c1c[n-]cc1C(O)=O


NC(C[O-])c1c[nH]cc1C(O)=O


[NH3+]C(CO)c1c[nH]cc1C([O-])=O


NC(CO)c1c[n-]cc1C([O-])=O


NC(C[O-])c1c[nH]cc1C([O-])=O


[NH3+]C(CO)c1c[n-]cc1C(O)=O


[NH3+]C(C[O-])c1c[nH]cc1C(O)=O


NC(C[O-])c1c[n-]cc1C(O)=O


[NH3+]C(CO)c1c[n-]cc1C([O-])=O


[NH3+]C(C[O-])c1c[nH]cc1C([O-])=O


NC(C[O-])c1c[n-]cc1C([O-])=O


[NH3+]C(C[O-])c1c[n-]cc1C(O)=O


[NH3+]C(C[O-])c1c[n-]cc1C([O-])=O








Note that the microspecies output is standardized (aromatized and dehydrogenized) for technical reasons.
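If only the fused table form is available, the individual microspecies and their net charges can be recovered with plain string handling. This is a rough Python sketch, not a SMILES parser: it assumes that no ring-closure number is shared between dot-separated components (true for the output above) and that charges appear only in bracket atoms written like [NH3+], [O-] or [Fe+2]. The helper names are invented for this example. A molecular formula would still need a real chemistry toolkit or cxcalc itself.

```python
import re

# Captures the charge part of a bracket atom, e.g. "+" in [NH3+] or "-" in [O-].
_CHARGE = re.compile(r"\[[^\]]*?([+-]+|[+-]\d+)\]")

def split_microspecies(fused):
    """Split a dot-separated (fused) SMILES string into its components."""
    return [part for part in fused.strip().split(".") if part]

def net_charge(smiles):
    """Sum the formal charges found in bracket atoms (naive string scan)."""
    total = 0
    for match in _CHARGE.finditer(smiles):
        token = match.group(1)
        sign = 1 if token[0] == "+" else -1
        if token[1:].isdigit():
            total += sign * int(token[1:])   # forms like +2 or -3
        else:
            total += sign * len(token)       # forms like +, -, ++ or --
    return total

# First three components of the fused majorms output shown in this thread.
FUSED = ("NC(CO)c1c[nH]cc1C(O)=O."
         "NC(CO)c2c[nH]cc2C([O-])=O."
         "NC(CO)c3c[n-]cc3C(O)=O")

species = split_microspecies(FUSED)
for smi in species:
    print(net_charge(smi), smi)
```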





This extension will be available in the next alpha pre-release in 1-2 days. I will write here another comment when it is ready.

User 259f5a561c

18-07-2005 17:37:30

Is there any chance that I can get the percentage distribution values of each microspecies at a particular pH?





Ideally I would like to be able to automate inputting a chemical structure, and have as output each microspecies at a particular pH, with information regarding its chemical formula and charge, and its percentage distribution.





Maybe this could be in the format of outputting mol files for each separate microspecies, with its percentage value as part of its filename?





Is this possible?





Thanks a ton.

ChemAxon fb166edcbd

19-07-2005 18:24:10

I have implemented this for Marvin 4.0.


I have added a new cxcalc item for this: msdistr.


This will output all microspecies sorted by their distributions at a given pH.


The default output will be sdf:-a (dearomatized SDF), but you can specify the output format with the -f option. The distribution values are stored in SDF tags (molecule properties); the default tag is DISTR[pH=...], but you can set it with the -t option. The pH is given with the -H option.

The output is a multiple-molecule SDF/MRV string/file.





Examples:





Code:



cxcalc -N hi msdistr test.mol -H 9.6








Code:



cxcalc -N hi msdistr test.mol -H 9.6 -t RESULT_9.6 -f mrv








The "-N hi" is necessary to skip the table header and the ID column, so that the output is pure molecule data.
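Once the multi-molecule SDF has been captured, the distribution values can be read back from the tags. Below is a minimal Python sketch of a reader for the standard SDF layout (molblock ending in "M  END", data items introduced by "> <TAG>", records closed by "$$$$"). The tag name RESULT_9.6 is taken from the -t example above; the two records and their values are made-up placeholders, not real msdistr output, and the function names are invented for this illustration.

```python
import re

# Matches an SDF data header such as ">  <RESULT_9.6>" and captures the tag name.
_TAG_HEADER = re.compile(r"^>.*<([^>]+)>")

def read_sdf_records(text):
    """Return a list of (molblock, tags) pairs from a multi-molecule SDF string."""
    records = []
    mol_lines, tags, current = [], {}, None
    in_molblock = True
    for line in text.splitlines():
        if line.strip() == "$$$$":            # record terminator
            records.append(("\n".join(mol_lines) + "\n", tags))
            mol_lines, tags, current = [], {}, None
            in_molblock = True
        elif in_molblock:
            mol_lines.append(line)
            if line.startswith("M  END"):     # end of the connection table
                in_molblock = False
        else:
            header = _TAG_HEADER.match(line)
            if header:
                current = header.group(1)
                tags[current] = ""
            elif not line.strip():
                current = None                # a blank line closes the data item
            elif current is not None:
                tags[current] = (tags[current] + "\n" + line).lstrip("\n")
    return records

def distribution(records, tag):
    """Return (title, value) pairs; the title is the first line of each molblock."""
    return [(mol.splitlines()[0], float(tags[tag]))
            for mol, tags in records if tag in tags]

# Placeholder data: invented single-atom records with invented values.
SAMPLE_SDF = """\
species-A
  placeholder

  1  0  0  0  0  0            999 V2000
    0.0000    0.0000    0.0000 N   0  0  0  0  0  0  0  0  0  0  0  0
M  END
>  <RESULT_9.6>
61.5

$$$$
species-B
  placeholder

  1  0  0  0  0  0            999 V2000
    0.0000    0.0000    0.0000 O   0  0  0  0  0  0  0  0  0  0  0  0
M  END
>  <RESULT_9.6>
38.5

$$$$
"""

records = read_sdf_records(SAMPLE_SDF)
print(distribution(records, "RESULT_9.6"))
```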





If you want to include other calculation results in additional tags, this is only possible in two steps, with SDF output:





Code:



cxcalc -N hi -o o.sdf msdistr test.mol -H 9.6


cxcalc -S formula charge o.sdf








or you can pipe the first result into the input of the second calculation:





Code:



cxcalc -N hi msdistr test.mrv -H 9.6 | cxcalc -S formula charge








Note that, for technical reasons, to get SDF output with the results in SDF tags you should specify the -S option right after cxcalc. This is true for "simple" calculations. Microspecies and tautomer output is different because these calculations return several molecules for one input molecule. In these cases you should therefore specify -N hi instead, to skip the table header and ID column of the text output, and otherwise use the default output (no -S option). I may change this not very user-friendly behaviour in the future.
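For batch use, the two-step recipe can be wrapped in a script. The Python sketch below only assembles the command lines that appear in this thread and hands them to subprocess; it assumes cxcalc is on the PATH and that the features described here are available in the installed version. The function names and the default intermediate file name are illustrative choices.

```python
import subprocess

def msdistr_command(input_file, ph, output_file=None):
    """Build the msdistr call shown in this thread as an argument list."""
    cmd = ["cxcalc", "-N", "hi"]
    if output_file:
        cmd += ["-o", output_file]
    cmd += ["msdistr", input_file, "-H", str(ph)]
    return cmd

def tag_command(sdf_file):
    """Build the second step: add formula and charge as SDF tags (-S)."""
    return ["cxcalc", "-S", "formula", "charge", sdf_file]

def run_two_steps(input_file, ph, intermediate="o.sdf"):
    """Run both steps and return the final SDF text (requires cxcalc)."""
    subprocess.run(msdistr_command(input_file, ph, intermediate), check=True)
    result = subprocess.run(tag_command(intermediate), check=True,
                            capture_output=True, text=True)
    return result.stdout

# Only print the commands here; nothing is executed.
print(" ".join(msdistr_command("test.mol", 9.6, "o.sdf")))
print(" ".join(tag_command("o.sdf")))
```

Passing argument lists rather than one shell string avoids quoting problems when a SMILES string is used as input instead of a file name.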





The multiple output file issue is more complicated to fit into the cxcalc framework. I hope that the above solution will be satisfactory.
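Separate files per microspecies can still be produced outside cxcalc by splitting the multi-molecule SDF afterwards. The following Python sketch is one possible way to do it, under the same assumptions about the SDF layout as any standard reader ("M  END" closes the molblock, "$$$$" closes the record). The tag name RESULT_9.6 comes from the -t example in this thread; the file-name pattern, the function names and the sample records are invented for illustration.

```python
import re
from pathlib import Path

def plan_mol_files(sdf_text, tag):
    """Map a file name (carrying the tag value) to the molfile text of each record."""
    plan = {}
    records = [r for r in re.split(r"^\$\$\$\$[ \t]*\n?", sdf_text, flags=re.M)
               if r.strip()]
    for index, record in enumerate(records, start=1):
        molblock, end, data = record.partition("M  END")
        if not end:
            continue  # not a complete molfile; skip it
        found = re.search(r"^>.*<" + re.escape(tag) + r">[^\n]*\n([^\n]*)",
                          data, flags=re.M)
        value = found.group(1).strip() if found else "unknown"
        safe = re.sub(r"[^0-9A-Za-z.+-]", "_", value)  # keep the name file-system friendly
        plan[f"ms{index:02d}_{safe}.mol"] = molblock + "M  END\n"
    return plan

def write_mol_files(plan, out_dir):
    """Write every planned molfile into out_dir (created if missing)."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    for name, text in plan.items():
        (out / name).write_text(text)

# Placeholder records with invented values, not real msdistr output.
SAMPLE_RECORDS = """\
species-A
  placeholder

  1  0  0  0  0  0            999 V2000
    0.0000    0.0000    0.0000 N   0  0  0  0  0  0  0  0  0  0  0  0
M  END
>  <RESULT_9.6>
61.5

$$$$
species-B
  placeholder

  1  0  0  0  0  0            999 V2000
    0.0000    0.0000    0.0000 O   0  0  0  0  0  0  0  0  0  0  0  0
M  END
>  <RESULT_9.6>
38.5

$$$$
"""

plan = plan_mol_files(SAMPLE_RECORDS, "RESULT_9.6")
print(list(plan))
```

Calling write_mol_files(plan, "microspecies") would then create one .mol file per record in a directory of that name.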





PS: As far as I know we have not had the pre-release yet, but Marvin 4.0 is on the way. I will let you know when there is a release or pre-release including these features.

User 259f5a561c

21-07-2005 05:46:10

Thanks. Sounds great.

ChemAxon fb166edcbd

15-08-2005 19:45:47

Marvin 4.0 has been released:


http://www.chemaxon.com/marvin





Marvin 4.0.1 will be released soon.