CFTanimoto Values using JChem for Excel

User 49a17482ef

15-01-2013 16:01:25

Hello,


I am an undergraduate student working on a senior thesis project on structural similarity measures, specifically the Tanimoto coefficient. I am using both JChem for Excel and ChemMine to generate Tanimoto values for some molecules that are cannabinoid receptor agonists. When I cross-reference the Tanimoto values generated by ChemMine (AP, or Atom Pair, Tanimoto) with the Dissimilarity CFTanimoto values (I subtract the CFTanimoto from one so that I am comparing similarity to similarity), some values are close (differing by only 0.02 or so), while others differ by 0.5 or more. Specifically, when comparing the structures of JWH-018 and UR-144, the Tanimoto generated by your software is 0.74, while ChemMine's Tanimoto value is 0.26. Logically, the Tanimoto value should be somewhere in the middle.
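For readers following along, the comparison described above can be sketched as follows. This is a minimal illustration of the standard Tanimoto coefficient on sets of "on" bit positions and of the dissimilarity-to-similarity conversion; it is not the code used by JChem for Excel or ChemMine, and the function names and example bit positions are made up for illustration.

```python
def tanimoto(bits_a, bits_b):
    """Tanimoto coefficient of two fingerprints given as sets of on-bit positions."""
    a, b = set(bits_a), set(bits_b)
    union = a | b
    if not union:
        # Convention assumed here: two empty fingerprints count as identical.
        return 1.0
    return len(a & b) / len(union)


def similarity_from_dissimilarity(dissimilarity):
    """Convert a Tanimoto dissimilarity (1 - T) back into a similarity T."""
    return 1.0 - dissimilarity


# Hypothetical on-bit positions, not taken from any real molecule.
fp_x = {1, 4, 7, 9}
fp_y = {1, 4, 8}

print(tanimoto(fp_x, fp_y))                 # 2 shared bits out of 5 distinct bits
print(similarity_from_dissimilarity(0.25))  # dissimilarity 0.25 -> similarity 0.75
```

The formula is the same in every package; what differs is how each program decides which bits to set, so two tools can report very different values for the same pair of molecules.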


Why are these differences so inconsistent? Both software packages use a bit string to calculate Tanimoto values, so is one program more extensive in what it compares? Does your software weight certain parts of a molecule more heavily than others? Is there another measure I should be using that would be more comparable to an Atom Pair Tanimoto value, or that would be more appropriate for my research? I need a Tanimoto value that is consistent, so it shouldn't offer the option of weighting an MCS or any other part of a molecule.


Any help you could provide me would be greatly appreciated.

ChemAxon 5fc3e8d7d0

16-01-2013 12:57:12

 




Hi,


We do not know how the ChemMine algorithm works, but if the two programs use different algorithms to calculate their fingerprints, you should expect different results as well.


You can find more detailed descriptions at the following links:


http://www.chemaxon.com/jchem/doc/user/ScreenMD.html
http://www.chemaxon.com/jchem/doc/user/fingerprint.html


 


Best regards,


Laszlo

User 49a17482ef

16-01-2013 20:26:41

Thank you very much for those links. They helped me narrow down what I think may be the cause of the discrepancy between the values. I just have two follow-up questions now:


1. Is there a way to tell whether, and by how much, bit collisions are affecting your CFTanimoto values?


2. Would increasing the number of bits set per pattern artificially inflate the Tanimoto similarity values for two similar molecules?


Thank you again for your help.

ChemAxon 5fc3e8d7d0

27-01-2013 19:49:10

1. You cannot calculate fingerprint values from a similarity value.
2. Similarities calculated from fingerprints that do not have the same (bit) length will not give comparable results.
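To illustrate why fingerprint length matters, here is a toy sketch. It is not ChemAxon's fingerprint algorithm: the structural features are hypothetical integer IDs, and "folding" is assumed to be a simple modulo onto the bit-string length. It only shows that the same two feature sets can yield different Tanimoto values once they are folded to different lengths, because distinct features may collide on the same bit.

```python
def tanimoto(bits_a, bits_b):
    """Tanimoto coefficient of two sets of on-bit positions."""
    union = bits_a | bits_b
    if not union:
        return 1.0  # assumed convention for two empty fingerprints
    return len(bits_a & bits_b) / len(union)


def fold(feature_ids, length):
    """Map feature IDs onto a bit string of the given length (toy modulo folding)."""
    return {feature_id % length for feature_id in feature_ids}


# Hypothetical feature IDs for two molecules; three features are shared.
features_a = {3, 17, 42, 99, 250, 777}
features_b = {3, 17, 42, 512, 1001, 2040}

long_similarity = tanimoto(fold(features_a, 1024), fold(features_b, 1024))
short_similarity = tanimoto(fold(features_a, 8), fold(features_b, 8))

print(f"1024-bit fingerprint: {long_similarity:.2f}")  # no collisions here: 3 / 9
print(f"   8-bit fingerprint: {short_similarity:.2f}")  # collisions: 3 / 4
```

In this toy case the short fingerprint inflates the similarity from about 0.33 to 0.75, since unrelated features end up sharing bits. Comparing values computed at the same length and with the same settings avoids mixing such effects.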


If your test molecules are public and you would like to try other fingerprints provided by ChemAxon (e.g., ECFP), or if you have any questions, please feel free to contact us.


 


Best regards,


Laszlo