User 9df74a15a4
05-11-2014 12:34:59
I'm working under Win7 (with PowerShell), not Unix, so some of the example lines in the documentation don't seem to work for me, and I am not really a command-line person. Also (or because of the latter) I find the options rather confusing, especially for Jarvis-Patrick, where there is a mix of jarp and generatemd. In other words, I could use some more basic guidance (not in the form of "check the documentation"; by the way, the links in the sticky "JKluster, Screen...." don't work).
Anyway, here is what I have and what I would like to do (GUI aside):
An sdf or txt (SMILES) file containing molecules and IDs, not necessarily with any attached data.
Cluster the contents according to the Tanimoto similarity (say 0.6) of fingerprints; not pharmacophore-based and not based on a single reference compound.
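To make the criterion concrete, here is a minimal Python sketch of the Tanimoto similarity between two binary fingerprints and of the 0.6 cutoff. It is only an illustration: the fingerprints are assumed to be sets of on-bit indices, the names `tanimoto` and `similar_pairs` and the toy data are made up for this example, and none of it is the jarp or generatemd implementation.

```python
def tanimoto(fp_a: set[int], fp_b: set[int]) -> float:
    """Tanimoto coefficient of two binary fingerprints given as sets of on-bit indices."""
    if not fp_a and not fp_b:
        return 0.0  # convention chosen here for two empty fingerprints
    common = len(fp_a & fp_b)
    # shared on-bits divided by the on-bits present in either fingerprint
    return common / (len(fp_a) + len(fp_b) - common)


def similar_pairs(fps: dict[str, set[int]], threshold: float = 0.6) -> list[tuple[str, str]]:
    """Return all ID pairs whose Tanimoto similarity is at least `threshold`."""
    ids = sorted(fps)
    return [
        (a, b)
        for i, a in enumerate(ids)
        for b in ids[i + 1:]
        if tanimoto(fps[a], fps[b]) >= threshold
    ]


# Toy fingerprints (hypothetical data, not generated from real molecules)
fps = {
    "mol1": {1, 2, 3, 4},
    "mol2": {1, 2, 3, 4, 5},
    "mol3": {7, 8, 9},
}
pairs = similar_pairs(fps, threshold=0.6)
```

Note that a tool may ask for a dissimilarity threshold rather than a similarity; if dissimilarity is defined as 1 minus Tanimoto, a similarity of 0.6 corresponds to a dissimilarity of 0.4.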
As far as I can tell, this calls for the Jarvis-Patrick method(?). An example from the documentation looks like it could do the trick:
generatemd c input.smi -c CF -k cfp.xml -D | jarp -f 512 -t 0.1 -c 0.3 -g
but cfp.xml (from the example folder) contains stuff not suitable for my case. Replacing it with "-f 1024" (or 512) gives an error. Also, why is it -D if jarp works with binary FPs?
Maybe I am not even looking at the correct example? Also, once this works, I guess the work-up is the next step. There are several not-so-obvious examples in the documentation, which might become more obvious once I can get far enough to play around with them. I would prefer the output as a single file, or in the worst case one file per cluster (which would suck with a large set that ends up with 20+ clusters and singletons), that can be viewed with Marvin, showing the seed of each cluster and the compounds belonging to it.
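The single-file layout I have in mind could be sketched roughly as follows. This is a hypothetical Python illustration, not the output format of jarp: the function name `write_clusters`, the column names, and the convention that the first entry of each cluster is its seed are all assumptions, and the sketch does not establish whether Marvin can display such a file directly.

```python
import csv


def write_clusters(path, clusters):
    """Write all clusters to one tab-separated file.

    clusters: list of clusters, each a list of (mol_id, smiles) tuples;
    the first entry of each cluster is treated as its seed (an assumption).
    """
    with open(path, "w", newline="") as fh:
        writer = csv.writer(fh, delimiter="\t")
        writer.writerow(["cluster", "role", "id", "smiles"])
        for number, members in enumerate(clusters, start=1):
            for position, (mol_id, smiles) in enumerate(members):
                if len(members) == 1:
                    role = "singleton"
                elif position == 0:
                    role = "seed"
                else:
                    role = "member"
                writer.writerow([number, role, mol_id, smiles])


# Hypothetical example data: one two-member cluster and one singleton
example = [
    [("ID001", "c1ccccc1O"), ("ID002", "c1ccccc1N")],
    [("ID003", "CCO")],
]
write_clusters("clusters.tsv", example)
```

With one row per compound and the cluster number in the first column, the file can be sorted or filtered by cluster instead of juggling one file per cluster.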
Thanks in advance.