Optimizing generateMD parameters

User 204415f4a4

31-08-2005 15:33:12

Hello all,





To optimise the generateMD parameters for chemical fingerprint (CF) generation, I used a set of 33703 molecules and varied three parameters: fingerprint length, maximum number of bonds in patterns, and maximum number of bits to set for each pattern. The resulting average and maximum "darkness" values are as follows:





FP length   Max. #bonds   Max. #bits   Avg. darkness (%)   Max. darkness (%)
512         7             3            65.0                98.0
512         7             4            78.8                99.4
512         7             5            81.9                99.4
512         8             3            72.3                99.4
512         8             4            84.4                99.4
512         8             5            86.9                99.4
1024        7             3            43.1                88.5
1024        7             4            57.7                98.4
1024        7             5            61.9                98.5
1024        8             3            50.8                99.1
1024        8             4            65.7                99.7
1024        8             5            69.6                99.7





With these results, which parameters would you choose for generating CFs as input for Compr self-dissimilarity analysis and Jarvis-Patrick clustering?





Thanks for your time,





Best regards,





IsI

ChemAxon efa1591b5a

01-09-2005 14:36:01

Hi IsI,





Darkness should be between 40 and 60%. Our typical parameter setup for similarity calculations is 1024, 7, 3 (fingerprint length, maximum number of bonds, maximum number of bits per pattern). If darkness does not exceed 60% with these values, then just use them.
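A minimal sketch of applying this rule to the table in the question. It assumes the 40-60% window is meant for the average darkness column (the reply does not say which column), and the names `RESULTS`, `in_window` and `candidates` are made up for the illustration:

```python
# Rows copied from the table in the question:
# (fp_length, max_bonds, max_bits, avg_darkness_pct, max_darkness_pct)
RESULTS = [
    (512, 7, 3, 65.0, 98.0),
    (512, 7, 4, 78.8, 99.4),
    (512, 7, 5, 81.9, 99.4),
    (512, 8, 3, 72.3, 99.4),
    (512, 8, 4, 84.4, 99.4),
    (512, 8, 5, 86.9, 99.4),
    (1024, 7, 3, 43.1, 88.5),
    (1024, 7, 4, 57.7, 98.4),
    (1024, 7, 5, 61.9, 98.5),
    (1024, 8, 3, 50.8, 99.1),
    (1024, 8, 4, 65.7, 99.7),
    (1024, 8, 5, 69.6, 99.7),
]

# Assumption: the recommended window applies to the AVERAGE darkness.
LOW, HIGH = 40.0, 60.0


def in_window(row):
    """Return True if the row's average darkness lies within [LOW, HIGH]."""
    return LOW <= row[3] <= HIGH


# Keep only the parameter sets whose average darkness is in the window.
candidates = [row for row in RESULTS if in_window(row)]

for fp_length, max_bonds, max_bits, avg_dark, max_dark in candidates:
    print(f"{fp_length}, {max_bonds}, {max_bits}: "
          f"avg {avg_dark}%, max {max_dark}%")
```

Under that assumption, none of the 512-bit settings qualify, and three of the 1024-bit settings do, including the typical 1024, 7, 3.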





If the clustering results do not justify these settings, try increasing the fingerprint size, e.g. up to 2048. The values 7 and 3 would then also need to be increased. Also note that a longer fingerprint not only consumes more memory but also increases the running time.
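To give a rough feel for the memory side, here is a back-of-the-envelope sketch for the 33703-molecule set from the question. It assumes each fingerprint is stored as packed bits with no per-record overhead, which is only an assumption and may not match how generateMD or the clustering tools actually store fingerprints; `raw_fingerprint_bytes` is a hypothetical helper:

```python
N_MOLECULES = 33703  # size of the molecule set in the question


def raw_fingerprint_bytes(fp_length_bits, n_molecules=N_MOLECULES):
    """Lower-bound storage estimate: packed bits, no overhead (assumption)."""
    return n_molecules * fp_length_bits // 8


for fp_length in (512, 1024, 2048):
    size = raw_fingerprint_bytes(fp_length)
    print(f"{fp_length} bits: {size} bytes (~{size / 1e6:.1f} MB)")
```

The estimate scales linearly with fingerprint length, so going from 1024 to 2048 bits doubles the raw storage for the same set.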





Regards,


Miklos