Distribution % missing when running cxcalc

User 0b9f745785

22-09-2015 09:08:34

I'm attempting to generate dominant tautomers using cxcalc dominanttautomerdistribution, and I've noticed that for some molecules the distribution % (which should be in the third column) is not written to stdout. Below is an example command and the corresponding output:



$ cxcalc "CC#CC(=O)C1=CCN(C)CC1 CHEMBL154385" dominanttautomerdistribution -H 7.4 -C false -t dist -f "smiles:n,T:dist"
#SMILES name    dist
CN1CCC(=CC1)C(O)=C=C=C  CHEMBL154385    
C[NH+]1CCC(=CC1)C(O)=C=C=C  CHEMBL154385    
CN1CCC(=CC1)C([O-])=C=C=C   CHEMBL154385    
C[NH+]1CCC(=CC1)C([O-])=C=C=C   CHEMBL154385    
CN1CCC(=CC1)C(=O)C=C=C  CHEMBL154385    
C[NH+]1CCC(=CC1)C(=O)C=C=C  CHEMBL154385    
CC#CC(=O)C1=CCN(C)CC1   CHEMBL154385    
CC#CC(=O)C1=CC[NH+](C)CC1   CHEMBL154385    
CN1CCC(=CC1)C(=O)C#C[CH2-]  CHEMBL154385    
C[NH+]1CCC(=CC1)C(=O)C#C[CH2-]  CHEMBL154385    



Can anyone shed some light on the cause of this problem? Is it indeed a bug and, if not, how should the absence of a distribution % be interpreted?
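
In the meantime, one way to spot affected rows in a larger run is to scan the output for an empty third column. This is only a minimal sketch: it assumes the cxcalc output is tab-separated with a `#`-prefixed header line, as in the example above, and `find_missing_dist` is a hypothetical helper name, not part of cxcalc.

```python
def find_missing_dist(output: str):
    """Return (smiles, name) pairs for rows whose dist column is empty or absent.

    Assumption: columns are tab-separated and the header starts with '#'.
    """
    missing = []
    for line in output.splitlines():
        # Skip blank lines and the '#SMILES name dist' header.
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.split("\t")
        smiles = fields[0]
        name = fields[1] if len(fields) > 1 else ""
        dist = fields[2].strip() if len(fields) > 2 else ""
        if not dist:
            missing.append((smiles, name))
    return missing
```

The captured stdout of a cxcalc run can be passed in as a single string; any pair returned marks a tautomer that was written without its distribution %.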

ChemAxon d51151248d

22-09-2015 12:18:02

Hi, 


Thank you for posting this. It has been confirmed as a bug. A fix will be available in next week's release.


Daniel

User 0b9f745785

22-09-2015 21:50:08

Thank you! Another subset of SMILES run with the same command fails with a confusing error message. Could you shed some light on the cause of this error? Below is an example:


$ cxcalc "[3H]c1ccc(S(=O)(=O)NC(=O)c2ccc(Cn3c(CNc4ccccc4C(=O)O)cnc3CCCC)cc2)c(Cl)c1 CHEMBL2111973" --ignore-error dominanttautomerdistribution -H 7.4 -C false -t dist -f "smiles:n,T:dist"
#SMILES name    dist
[3H]c1ccc(c(Cl)c1)S(=O)(=O)[#7]-[#6](=O)-c1ccc(-[#6]-n2c(-[#6]-[#7]-c3ccccc3-[#6](-[#8])=O)cnc2-[#6]-[#6]-[#6]-[#6])cc1 1 "java.lang.ArrayIndexOutOfBoundsException: 40\n at chemaxon.calculations.CanonicTautomer.calcTautomerScore(CanonicTautomer.java:3094)\n at chemaxon.calculations.CanonicTautomer.calcTautomerScore(CanonicTautomer.java:3008)\n at chemaxon.calculations.Tautomerization.calcTautomerDistr(Tautomerization.java:18864)\n at chemaxon.calculations.Tautomerization.calculateDACouples(Tautomerization.java:12717)\n at chemaxon.calculations.Tautomerization.createDACouples(Tautomerization.java:12564)\n at chemaxon.calculations.Tautomerization.calcDominantTautomerDistribution(Tautomerization.java:18802)\n at chemaxon.marvin.calculations.TautomerizationPlugin.run(TautomerizationPlugin.java:862)\n at chemaxon.marvin.plugin.concurrent.PluginWorkUnit.call(PluginWorkUnit.java:91)\n at chemaxon.marvin.plugin.concurrent.ReusablePluginWorkUnit.call(ReusablePluginWorkUnit.java:65)\n at chemaxon.util.concurrent.marvin.CompositeWorkUnit.call(CompositeWorkUnit.java:73)\n at chemaxon.util.concurrent.processors.SingleThreadedProcessor.getNext(SingleThreadedProcessor.java:70)\n at chemaxon.marvin.Calculator.run(Calculator.java:1499)\n at chemaxon.marvin.Calculator.run(Calculator.java:1400)\n at chemaxon.marvin.Calculator.main(Calculator.java:2078)\n"
dominanttautomerdistribution:FAILED
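
For anyone wanting to collect the failing structures from a batch run with --ignore-error, the rows carrying a Java stack trace can be filtered out of the output. This is a rough sketch under stated assumptions: it treats any data row containing the text "Exception" as a failure and takes the first whitespace-delimited token as the SMILES, which matches the output above but is not a documented cxcalc format. `failed_smiles` is a hypothetical helper name.

```python
def failed_smiles(output: str):
    """Return the SMILES of rows whose output contains a Java exception.

    Assumption: a failed row embeds the stack trace text (including the
    word 'Exception') on the same line as the SMILES, as in the example.
    """
    failed = []
    for line in output.splitlines():
        # Skip blank lines and the '#SMILES name dist' header.
        if not line.strip() or line.startswith("#"):
            continue
        if "Exception" in line:
            # SMILES contain no whitespace, so the first token is the structure.
            failed.append(line.split(None, 1)[0])
    return failed
```

The trailing `dominanttautomerdistribution:FAILED` line is ignored by this filter because it carries no exception text.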

ChemAxon d51151248d

24-09-2015 08:41:11

Hi, 


Thank you again. This has also been confirmed as a bug. The fix will be available soon. Did you find any other bugs?


I would really appreciate it if you sent us any other molecules that produced errors during tautomer generation.


Daniel

User 0b9f745785

24-09-2015 19:39:22

You're welcome! I believe these are the only potential bugs I encountered. Attached are 67 SMILES for which cxcalc died with errors during tautomerization (cxcalc_error_smiles.smi, the second bug).


Will the fix to the second bug be available in next week's release as well? Thank you for your prompt replies!


Seth

User 0b9f745785

24-09-2015 23:53:16

Actually, as I was looking through some of the tautomers, I did notice something odd: running the following command generates a tautomer with a valency of 5 on a nitrogen.


$ cxcalc "CCCCc1c[n+](Cc2ccc(-c3ccccc3C3=[NH2+2][N-]N=N3)cc2)cn1Cc1ccc(-c2ccccc2-c2nn[n-]n2)cc1 CHEMBL2337687" dominanttautomerdistribution -H 7.4 -C false -t dist -f smiles:n,T:dist
#SMILES name dist
CCCCC1=C[N+](CC2=CC=C(C=C2)C2=CC=CC=C2C2=[NH2]NN=N2)=CN1CC1=CC=C(C=C1)C1=CC=CC=C1C1=NNN=N1 CHEMBL2337687 71


This doesn't seem right. Is there a reason it produces a nitrogen with valency of 5?


Seth

ChemAxon d51151248d

25-09-2015 11:52:51

Hi Seth, 


We have already fixed both bugs, but because of testing, the fixes will most probably be released in 2 weeks.


Regarding your question about the N atom with a valency of 5: the input structure itself contains such an N atom. See the attached picture, which shows the molecule on the MarvinSketch canvas with the valency label v5 on that N. So is your question why we keep the valency of 5?


We will look at the other structures in the attached SMILES as well. 


Thank you, 


Daniel

User 0b9f745785

22-10-2015 20:42:14

Hi Daniel,


Ah yes, that was my mistake. Thanks for pointing that out.


Has the version of cxcalc with these bugs fixed been released yet?


Seth

ChemAxon d51151248d

26-10-2015 11:56:08

Hi, 


Yes, the relevant bug fixes are already included in this week's release, version 15.10.26.


I suggest that you update your current version. 


Daniel

ChemAxon d51151248d

26-10-2015 13:47:12

Hi again, 


We will provide another bug fix for dominant tautomer generation of the molecules containing P in your attached SMILES file. We found further bugs while testing all the molecules in that file.


So your best choice would be to wait until next week for this fix. However, the fixes for the bugs you originally posted are already available.


Daniel