FCFP4 Tanimoto similarity of JCHEM and Pipeline Pilot

User abcdea6f24

10-04-2014 14:54:45

I have two molecules:


CCCC(=O)OC(C)(CCC=C(C)C)C=C


CC(C)=CCCC(C)(OC(C)=O)C=C


Pipeline Pilot (FCFP_4) reports a Tanimoto similarity of 0.81 between these two molecules, whereas JChem (FCFP4) reports 0.35. My JChem configuration file is attached.
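
For reference, the Tanimoto coefficient is the number of fingerprint features the two molecules share, divided by the number of distinct features found in either one. A minimal Python sketch of that calculation follows, assuming each fingerprint is available as a set of integer feature identifiers. The identifiers below are made up for illustration and are not real FCFP output from either tool.

```python
def tanimoto(features_a, features_b):
    """Tanimoto (Jaccard) coefficient of two collections of feature identifiers."""
    a, b = set(features_a), set(features_b)
    union = a | b
    if not union:
        # Convention chosen for this sketch: two empty fingerprints are identical.
        return 1.0
    return len(a & b) / len(union)


# Hypothetical feature identifiers, not real FCFP hashes.
fp_a = {11, 12, 13, 14, 15}
fp_b = {11, 12, 13, 14, 16}

# 4 shared features out of 6 distinct ones.
similarity = tanimoto(fp_a, fp_b)
print(f"{similarity:.2f}")
```

Because the score depends only on which features each tool generates, any difference in how atoms are classified changes both the shared count and the total count.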


 


Can anyone tell me the reason for this difference, or how I should modify the configuration file?

ChemAxon d51151248d

12-05-2014 11:41:32

Hi, 


Unfortunately, we couldn't reproduce the calculated similarity values. Could you please send us the exported Pipeline Pilot workflow and the JChem FCFP fingerprint calculation options? We could then investigate the issue and send you suitable settings.


Thanks, 


Daniel

User abcdea6f24

11-08-2014 06:36:27

Thanks for replying. The PP protocol is attached.

ChemAxon 5fc3e8d7d0

27-08-2014 12:39:56

Dear Christen,


There are two important points about the FCFP fingerprint similarity calculation:

1. The implementations of FCFP feature recognition in Pipeline Pilot and JChem may differ, so we cannot guarantee exactly the same result in every case.

2. The default FCFP configurations of Pipeline Pilot and JChem are very different. The default JChem FCFP features, defined as Chemical Terms expressions, are:


<Property Name="HydrogenBondAcceptor" Value="1">acceptor()</Property>
<Property Name="HydrogenBondDonor" Value="1">donor()</Property>
<Property Name="Aromatic" Value="1">arom()</Property>
<Property Name="Charge" Value="1"><![CDATA[ charge() * 10 ]]></Property>

The Pipeline Pilot documentation lists six default FCFP features: IsAcceptor, IsDonor, IsNegativeIonizable, IsPositiveIonizable, IsAromatic, IsHalogen.
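
To illustrate why the choice of feature classes matters: functional-class fingerprints start from a per-atom code built from features like these, so two tools that define the classes differently can assign different starting codes to the same atom. The Python sketch below only illustrates that idea. It is not the Pipeline Pilot or JChem implementation, and the bit order and the `feature_code` helper are assumptions made for this example.

```python
# Bit positions for the six feature classes listed above.
# The ordering is an arbitrary choice for this sketch.
FEATURES = (
    "IsAcceptor",
    "IsDonor",
    "IsNegativeIonizable",
    "IsPositiveIonizable",
    "IsAromatic",
    "IsHalogen",
)


def feature_code(flags):
    """Pack the feature classes that apply to one atom into an integer."""
    unknown = set(flags) - set(FEATURES)
    if unknown:
        raise ValueError(f"unknown feature(s): {sorted(unknown)}")
    code = 0
    for bit, name in enumerate(FEATURES):
        if name in flags:
            code |= 1 << bit
    return code


# Hypothetical atoms: the flags are supplied by hand, not perceived from a structure.
plain_atom = feature_code(set())              # no feature applies
acceptor_atom = feature_code({"IsAcceptor"})  # lowest bit set
halogen_atom = feature_code({"IsHalogen"})    # highest bit set
```

With a configuration that has no halogen class, the halogen atom above would receive the same code as the plain atom, and every fingerprint feature grown from it would change accordingly.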

I created a JChem configuration that mirrors these six features (attached):


<Property Name="HydrogenBondAcceptor" Value="1">acceptor()</Property>
<Property Name="HydrogenBondDonor" Value="1">donor()</Property>
<Property Name="Aromatic" Value="1">arom()</Property>
<Property Name="NegativeIonizable" Value="1"><![CDATA[ formalCharge() < 0 ]]></Property>
<Property Name="PositiveIonizable" Value="1"><![CDATA[ formalCharge() > 0 ]]></Property>
<Property Name="Halogen" Value="1">(atno() == 9) || (atno() == 17) || (atno() == 35) || (atno() == 53)</Property>

Please try this file. We ran tests with this configuration and got results very similar to those from Pipeline Pilot.
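
If you want to check which feature classes a configuration defines before running it, the Property lines can be listed with a few lines of Python. This is only a sketch: the `<Root>` wrapper element and the `list_features` helper are assumptions made so the six lines above parse as one well-formed document, and they are not part of the attached file.

```python
import xml.etree.ElementTree as ET

# The six Property lines from the configuration above, wrapped in a
# placeholder <Root> element (an assumption of this sketch).
FRAGMENT = """<Root>
<Property Name="HydrogenBondAcceptor" Value="1">acceptor()</Property>
<Property Name="HydrogenBondDonor" Value="1">donor()</Property>
<Property Name="Aromatic" Value="1">arom()</Property>
<Property Name="NegativeIonizable" Value="1"><![CDATA[ formalCharge() < 0 ]]></Property>
<Property Name="PositiveIonizable" Value="1"><![CDATA[ formalCharge() > 0 ]]></Property>
<Property Name="Halogen" Value="1">(atno() == 9) || (atno() == 17) || (atno() == 35) || (atno() == 53)</Property>
</Root>"""


def list_features(xml_text):
    """Return (name, expression) pairs for every Property element."""
    root = ET.fromstring(xml_text)
    return [(p.get("Name"), (p.text or "").strip()) for p in root.iter("Property")]


features = list_features(FRAGMENT)
for name, expression in features:
    print(f"{name}: {expression}")
```

Comparing the printed names with the six Pipeline Pilot feature classes shows at a glance whether any class is missing from the JChem side.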


If you would like to know more about ChemAxon ECFP/FCFP fingerprints, see the following page: https://docs.chemaxon.com/display/CD/ECFP+%28Extended+Connectivity%29+fingerprint


 


Best regards,
Laszlo