cxcalc batch creates random exactmass in v5.2.4

User 677b9c22ff

05-12-2009 23:23:59

Hi,


I just came across this in jchem.version=5.2.4 on WinXP, running cxcalc in batch mode.


C:\XXX>cxcalc formula exactmass quak-quak-quak.smi
id      Formula Exact mass
1       C18H28N2O4      340.233629858
2       C4H6N4O3S2      225.009673306
3       C34H47NO11      656.393898127
4       C22H24N2O2      350.198139254
5       C20H26N2O2      328.213789318
6       C17H13ClN4      308.082874143
7       C15H23NO2       251.187240217
8       C10H17N 151.136099549
9       C10H16N2O3S     247.109704926
10      C6H8ClN7O       230.055066236
11      C15H12ClN3O     286.074070344
12      C15H13N3O       252.113042669
13      C13H17N3O       232.144342797



The problem is that the exact mass is wrong for most of these formulae.


MarvinView calculates them correctly, as does JChem for Excel.


NUM  Formula       ExactMass (correct)  DIFF (correct - cxcalc)
1    C18H28N2O4    336.2049074          -4.028722464
2    C4H6N4O3S2    221.9881315          -3.021541848
3    C34H47NO11    645.3149114          -11.07898678
4    C22H24N2O2    348.183778           -2.014361232
5    C20H26N2O2    326.1994281          -2.014361232
6    C17H13ClN4    308.0828741          0
7    C15H23NO2     249.172879           -2.014361232
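
The "correct" column can be cross-checked without any ChemAxon tool, since the exact (monoisotopic) mass is the sum of the most-abundant-isotope masses weighted by the atom counts in the formula. The following is only a minimal Python sketch: the function name `exact_mass` and the isotope mass table are assumptions of this example (commonly tabulated values, not taken from cxcalc), so the last digits may differ slightly from ChemAxon's output. It only handles flat formulae without brackets or charges.

```python
import re

# Monoisotopic masses (u) of the most abundant isotopes.
# Assumption: commonly tabulated values, not ChemAxon's internal table.
MONOISOTOPIC = {
    "C": 12.0,
    "H": 1.00782503207,
    "N": 14.0030740048,
    "O": 15.99491461956,
    "S": 31.97207100,
    "Cl": 34.96885268,
}


def exact_mass(formula):
    """Sum isotope masses for a flat formula such as 'C17H13ClN4'."""
    total = 0.0
    # Each match is (element symbol, optional count); a missing count means 1.
    for symbol, count in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        total += MONOISOTOPIC[symbol] * (int(count) if count else 1)
    return total


for formula in ("C18H28N2O4", "C4H6N4O3S2", "C17H13ClN4"):
    print(formula, round(exact_mass(formula), 6))
```

Comparing these values with the cxcalc column using a small tolerance (for example 1e-5) is enough to tell the correct rows from the shifted ones.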


See attached file quak-quak-quak.smi.


This reminds me that not only should developers always have unit tests, but users should also always use a validation set to monitor changes in algorithms etc.


For example, if I run the same file several times, the results vary from run to run.


I am not sure whether this also holds for the other calculations (logp, logd, pka, wiener).




C:\XXX>cxcalc formula exactmass quak-quak-quak.smi
id      Formula Exact mass
1       C18H28N2O4      340.233629858
2       C4H6N4O3S2      225.009673306
3       C34H47NO11      656.393898127
4       C22H24N2O2      350.198139254
5       C20H26N2O2      328.213789318
6       C17H13ClN4      308.082874143
7       C15H23NO2       251.187240217
8       C10H17N 151.136099549
9       C10H16N2O3S     247.109704926
10      C6H8ClN7O       230.055066236
11      C15H12ClN3O     286.074070344
12      C15H13N3O       252.113042669
13      C13H17N3O       232.144342797
C:\XXX>cxcalc formula exactmass quak-quak-quak.smi
id      Formula Exact mass
1       C18H28N2O4      336.204907394
2       C4H6N4O3S2      221.988131458
3       C34H47NO11      645.314911351
4       C22H24N2O2      348.183778022
5       C20H26N2O2      326.199428086
6       C17H13ClN4      308.082874143
7       C15H23NO2       249.172878985
8       C10H17N 151.136099549
9       C10H16N2O3S     244.088163078
10      C6H8ClN7O       229.047885620
11      C15H12ClN3O     285.066889728
12      C15H13N3O       251.105862053
13      C13H17N3O       231.137162181
C:\XXX>
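
One way to catch this kind of run-to-run drift is to diff the numeric column of two captured outputs. This is a rough Python sketch, not a ChemAxon utility: the function names `parse_cxcalc` and `diff_runs` are hypothetical, and the parser assumes the three-column id / Formula / Exact mass layout shown above. It also accepts a decimal comma, since some locales print one.

```python
def parse_cxcalc(text):
    """Return {id: (formula, mass)} from captured cxcalc output text."""
    rows = {}
    for line in text.splitlines():
        fields = line.split()
        # Skip the header, prompts and anything that is not "id formula mass".
        if len(fields) != 3 or not fields[0].isdigit():
            continue
        mol_id, formula, mass = fields
        rows[int(mol_id)] = (formula, float(mass.replace(",", ".")))
    return rows


def diff_runs(first, second, tol=1e-6):
    """Return {id: (mass_in_first, mass_in_second)} for rows that disagree."""
    a, b = parse_cxcalc(first), parse_cxcalc(second)
    return {
        i: (a[i][1], b[i][1])
        for i in sorted(a.keys() & b.keys())
        if abs(a[i][1] - b[i][1]) > tol
    }


# Two rows taken from the runs above.
RUN_1 = """id Formula Exact mass
1 C18H28N2O4 340.233629858
6 C17H13ClN4 308.082874143
"""
RUN_2 = """id Formula Exact mass
1 C18H28N2O4 336.204907394
6 C17H13ClN4 308.082874143
"""

print(diff_runs(RUN_1, RUN_2))
```

An empty result means the two runs agree; the same function works for comparing a run against a stored validation set.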

 


I am pretty sure this is a serious bug, because the same happens with other files too. I just don't know how and why this could happen.


Tobias

ChemAxon e08c317633

07-12-2009 22:03:11

Hi Tobias,


It seems to be a concurrency issue (cxcalc utilizes all CPU cores for the calculation). Please use the "-s" cxcalc switch (undocumented) until we fix this error. It runs the calculations in single-threaded mode (on one CPU core).


Example:


$ cxcalc -s formula exactmass quak-quak-quak.smi
id Formula Exact mass
1 C18H28N2O4 336,204907394
2 C4H6N4O3S2 221,988131458
3 C34H47NO11 645,314911351
4 C22H24N2O2 348,183778022
5 C20H26N2O2 326,199428086
6 C17H13ClN4 308,082874143
7 C15H23NO2 249,172878985
8 C10H17N 151,136099549
9 C10H16N2O3S 244,088163078
10 C6H8ClN7O 229,047885620
11 C15H12ClN3O 285,066889728
12 C15H13N3O 251,105862053
13 C13H17N3O 231,137162181
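
For scripted batch runs, the workaround can be applied in one place by building the command line in a small helper. This is only a sketch in Python under stated assumptions: the helper names `build_cxcalc_command` and `run_cxcalc` are hypothetical, the "-s" switch is the one described above, and the executable name may need to be a full path (for example to the batch file on Windows).

```python
import subprocess


def build_cxcalc_command(input_file, calculations=("formula", "exactmass"),
                         single_threaded=True, executable="cxcalc"):
    """Build the argument list, adding "-s" to force single-threaded mode."""
    cmd = [executable]
    if single_threaded:
        cmd.append("-s")
    cmd.extend(calculations)
    cmd.append(input_file)
    return cmd


def run_cxcalc(input_file, **options):
    """Run cxcalc and return its standard output as text.

    Assumption: the executable is resolvable on this machine; otherwise pass
    executable="<full path>" in the options.
    """
    result = subprocess.run(
        build_cxcalc_command(input_file, **options),
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError on a non-zero exit code
    )
    return result.stdout


print(" ".join(build_cxcalc_command("quak-quak-quak.smi")))
```

Keeping the switch behind a `single_threaded` parameter makes it easy to turn the workaround off again once a fixed version is installed.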


Only cxcalc runs the ElementalAnalyserPlugin calculations concurrently, that's why MarvinSketch/View and JChem for Excel return the correct results.


We tried to reproduce this error with other calculations (pKa, logP, PSA, etc.) using Marvin 5.2.4, but these seem to work fine. Also, with Marvin 5.3 (to be released soon) this exactmass bug is not reproducible.


We will get back to you when we find out more about this bug.


Zsolt

User 677b9c22ff

07-12-2009 23:32:24

Hi Zsolt,


Ohhh, the wonders of concurrency. Thanks for the upcoming fix. It's pretty clear, though, that 4, 8, 12, 16, or 80 cores will be the future, so dealing with software threading issues at an early stage is clearly an advantage. How hard is it to convince people that instead of waiting one hour they only have to wait 2 minutes when using a 32-thread SMP setup? :-)


 


Although I have to say the -s switch does not do anything for exactmass in the older 5.2.2 version: it is neither faster nor slower and uses the same number of threads (I don't know how well they are utilized).


 


However, in the case of logP, which seems to be better parallelized, it does something: CPU use is lower with -s and the thread count drops from 12 to 10.


 


I also tried to reproduce the bug with the older version 5.2.2, but that version seems fine. It could also be a Java version issue.


Cheers


Tobias

User 677b9c22ff

04-01-2010 16:24:24

Hi,


I just came across another weird behaviour: after using the safe switch (single-threaded, cxcalc -s) around 10 times, cxcalc printed integer values once the -s switch was no longer used.




C:\temp>cxcalc -s formula exactmass quak-quak-quak.smi
id      Formula Exact mass
1       C18H28N2O4      336.204907394
2       C4H6N4O3S2      221.988131458
3       C34H47NO11      645.314911351
4       C22H24N2O2      348.183778022
5       C20H26N2O2      326.199428086
6       C17H13ClN4      308.082874143
7       C15H23NO2       249.172878985
8       C10H17N 151.136099549
9       C10H16N2O3S     244.088163078
10      C6H8ClN7O       229.047885620
11      C15H12ClN3O     285.066889728
12      C15H13N3O       251.105862053
13      C13H17N3O       231.137162181
C:\temp>cxcalc  formula exactmass quak-quak-quak.smi
id      Formula Exact mass
1       C18H28N2O4      308
2       C4H6N4O3S2      216
3       C34H47NO11      598
4       C22H24N2O2      324
5       C20H26N2O2      300
6       C17H13ClN4      295
7       C15H23NO2       226
8       C10H17N 134
9       C10H16N2O3S     228
10      C6H8ClN7O       221
11      C15H12ClN3O     273
12      C15H13N3O       238
13      C13H17N3O       214

This is still the old version (jchem.vernum=5.2.4, WinXP 32-bit). I just want to make sure that command-line use is OK in the next release, because I think it is quite powerful and I use it quite often.


 


Thanks


Tobias

User 677b9c22ff

04-01-2010 16:34:29

Hi,



java version "1.6.0_11"
Java(TM) SE Runtime Environment (build 1.6.0_11-b03)
Java HotSpot(TM) Client VM (build 11.0-b16, mixed mode, sharing)


With the Java server VM (-server), digits are sometimes cut off as well (see the 2nd row) and the value is wrong. With -server -s there seems to be no problem.




C:\temp>cxcalc -server formula exactmass quak-quak-quak.smi
id      Formula Exact mass
1       C18H28N2O4      336.204907394
2       C4H6N4O3S2      222.174
3       C34H47NO11      645.314911351
4       C22H24N2O2      348.183778022
5       C20H26N2O2      326.199428086
6       C17H13ClN4      308.082874143
7       C15H23NO2       249.172878985
8       C10H17N 151.136099549
9       C10H16N2O3S     244.088163078
10      C6H8ClN7O       229.047885620
11      C15H12ClN3O     285.066889728
12      C15H13N3O       251.105862053
13      C13H17N3O       231.137162181

Thanks


Tobias

ChemAxon e08c317633

07-01-2010 15:05:18

Hi Tobias,


Thanks for the further information.


To fix this error we will internally disable concurrent execution of ElementalAnalyserPlugin calculations in Marvin 5.2.7 (the next Marvin patch release). ElementalAnalyser calculations are fast, so the slowdown should be negligible. In Marvin versions 5.2.0 - 5.2.6 the "-s" cxcalc switch workaround should be used, as described in previous posts.


Marvin 5.3 (upcoming) is not affected by this bug, so in that version concurrent execution will be enabled.
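

To illustrate the general idea of disabling concurrent execution for one calculation: access to a calculator that keeps intermediate state on a shared instance can be serialized with a lock, so worker threads never interleave inside it. The sketch below is in Python and purely hypothetical. The class `ToyAnalyser` is a toy stand-in, and the actual cause inside ElementalAnalyserPlugin is not stated in this thread.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


class ToyAnalyser:
    """Hypothetical calculator that keeps a running total on the instance."""

    def __init__(self):
        self._lock = threading.Lock()
        self._total = 0.0  # shared scratch state, unsafe if calls interleave

    def exact_mass(self, atom_masses):
        # The lock serializes calls, so only one thread touches _total at a time.
        with self._lock:
            self._total = 0.0
            for mass in atom_masses:
                self._total += mass
            return self._total


analyser = ToyAnalyser()
# Toy "molecules": n carbon atoms each, so the sums are exact in floating point.
molecules = [[12.0] * n for n in range(1, 50)]

with ThreadPoolExecutor(max_workers=8) as pool:
    parallel = list(pool.map(analyser.exact_mass, molecules))

serial = [sum(atoms) for atoms in molecules]
print(parallel == serial)
```

The trade-off is the one described above: the guarded calculation no longer benefits from several cores, which matters little when each call is fast.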


Zsolt