Scalar descriptors in Ward.

User 6628eeb49e

28-02-2008 05:19:59

Hi,

I have tried the JKlustor module.

The Ward help page says: "The Ward application uses Ward's minimum variance method for clustering molecules based on molecular fingerprints or other descriptors."

So I tried it with LogD, Mass, Heavy Atoms, HBD, HBA, etc.

I ran generatemd with the -D option as well (see attached file).

Then I ran the ward program as follows:

ward -f 512 -c 100 -x -y -g <Testdata\input\output\new_logd.logd > Testdata\input\output\cluster_mol2.txt

It shows the error "Error: For input string: "2.9977"".

Can you please help me?

ChemAxon efa1591b5a

28-02-2008 08:52:56


As you are not using a fingerprint in the clustering, the fingerprint length should be specified as 0, that is, -f 0. Also, you are using a 1-dimensional float descriptor vector, so you need to put -m 1 in the command line to specify the dimensionality of your data.
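As a quick pre-check, a sketch along these lines could confirm that every line of the input really holds the expected number of float values before running ward. It is only an illustration in Python, not part of JChem: the helper name check_descriptor_lines is made up, and it assumes the file contains nothing but whitespace-separated numbers, one molecule per line, which may not match what generatemd actually writes.

```python
def check_descriptor_lines(lines, dims):
    """Return (line_number, text) pairs for lines that do not hold exactly `dims` floats.

    Assumption: one molecule per line, whitespace-separated numeric values only.
    """
    bad = []
    for number, raw in enumerate(lines, start=1):
        text = raw.strip()
        if not text:
            continue  # skip blank lines
        try:
            values = [float(field) for field in text.split()]
        except ValueError:
            bad.append((number, text))  # a field is not a number
            continue
        if len(values) != dims:
            bad.append((number, text))  # wrong number of values for -m
    return bad


if __name__ == "__main__":
    sample = ["2.9977", "448.33463", "n/a", "1.0 2.0"]
    # With dims=1 (matching -m 1), the third and fourth lines are reported.
    print(check_descriptor_lines(sample, dims=1))
```

To check a real file, the same function can be given an open file object instead of the sample list.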

Does this help?


User 6dd863a614

29-02-2008 09:14:07

As you suggested, I tried this:

ward -f 0 -m 1 -g -c 25 -x -y -Z <Testdata\inpt\output\new_logd.txt >Testdata\input\output\cluster_logd.txt

With a smaller number of molecules (16000), it worked fine.

For the entire data set, it now shows the following error:

"Unknown error


at java.util.Vector.firstElement(Unknown Source)

at chemaxon.clustering.ACompoundInTheSpace.getFirstNeighbour(ACompoundInTh

at chemaxon.clustering.Ward.RNN(


at chemaxon.clustering.Ward.main("

Is there any size limit for the program? The same number of molecules worked for other descriptors such as fingerprints.

What could be the reason? Can you please help me?

ChemAxon efa1591b5a

03-03-2008 08:32:06

There should be no theoretical limitation, though the practical limit is around 10K due to Ward's time complexity (which is quadratic). This sounds like a bug; I will assign it to the developer of our Ward-Murtagh implementation.
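To give a feel for what quadratic growth means here, the sketch below simply counts the distinct molecule pairs for a few input sizes. It is a back-of-the-envelope illustration in Python, not a description of how the Ward implementation works internally; the function name pair_count and the sample sizes are chosen only for the example.

```python
def pair_count(n):
    """Number of distinct unordered pairs among n items: n * (n - 1) / 2."""
    return n * (n - 1) // 2


if __name__ == "__main__":
    # Sample sizes are illustrative only.
    for n in (10_000, 16_000, 100_000):
        print(f"{n} molecules: {pair_count(n)} pairs")
```

Going from 10K to 16K molecules multiplies the pair count by roughly 2.56, which is why run time climbs much faster than the input size.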

I expect a bug fix in the next release of JChem (version 5.0.2; the release date is not yet scheduled).



User 6dd863a614

03-03-2008 10:43:12

Thanks for your quick reply.

The problem seems to be specific to LogD, because LogP works well with the same file.

But for Mass, H-bond donors, and H-bond acceptors it shows this error:

Error: For input string: "448.33463"

I have specified the fingerprint size (0) and the dimensionality (1), so what else could the problem be?

ChemAxon efa1591b5a

04-03-2008 10:46:11

I did not manage to reproduce this bug, though I tried fairly large inputs. Could you attach your 'new_logd.txt' file (the large one that fails) to your post? That would be most helpful for us in tracking down this bug.



User 6dd863a614

07-03-2008 06:18:52

I am sorry, that was a mistake in the command.

It is now working fine, except for LogD.

Thank you very much once again for your attention to my queries.

ChemAxon efa1591b5a

11-03-2008 08:46:14

Thanks for your response. We will check the logD file you attached.