User 0a45234d6e
20-09-2012 19:13:54
Hi,
I found that the ECFP results generated by ChemAxon differ from those produced by Pipeline Pilot. I guess it has something to do with the configuration file. Please help me figure it out.
E.g., for the input molecule "CCC", the results from ChemAxon are:
-887929887
-557513035
-194534908
887303108
The results from Pipeline Pilot are:
ECFP_4
734603939
1559650422
863188371
-952661137
The total number of descriptors is the same, but the coding of the descriptors is quite different. Thanks.
ChemAxon 4a2fc68cd1
20-09-2012 21:29:44
Hi,
The concept of ECFP is the same in ChemAxon's implementation and in Pipeline Pilot, but different hash functions are used internally in the generator algorithms. Therefore, the number of generated ECFP features will typically be the same, but the "coding" of the descriptor (the actual integer identifiers) will definitely be different even if exactly the same atom properties are considered in the two implementations. Note that the applied hash functions are not public, so there is no way to produce the same results.
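To illustrate the point about hash functions, here is a minimal Python sketch of a toy ECFP-like procedure run twice with two different, arbitrarily chosen hash functions (CRC32 and truncated MD5). Everything in it is an assumption made for illustration: the simplified atom invariants, the hard-coded propane graph, the helper names (`toy_ecfp`, `hash_crc32`, `hash_md5`, `to_int32`) and the hash functions themselves. It is not the algorithm used by ChemAxon or Pipeline Pilot, and it will not reproduce the identifiers of either tool.

```python
import hashlib
import zlib

# Propane "CCC" as a hydrogen-suppressed graph: atom index -> neighbor indices.
NEIGHBORS = {0: [1], 1: [0, 2], 2: [1]}
# Simplified atom invariants (assumption): (atomic number, heavy-atom degree, H count).
INVARIANTS = {0: (6, 1, 3), 1: (6, 2, 2), 2: (6, 1, 3)}


def to_int32(value):
    """Fold an unsigned integer into the signed 32-bit range."""
    value &= 0xFFFFFFFF
    return value - 0x100000000 if value >= 0x80000000 else value


def hash_crc32(data):
    """Illustrative hash function A (not the one used by any real tool)."""
    return to_int32(zlib.crc32(repr(data).encode()))


def hash_md5(data):
    """Illustrative hash function B (not the one used by any real tool)."""
    digest = hashlib.md5(repr(data).encode()).digest()
    return to_int32(int.from_bytes(digest[:4], "big"))


def toy_ecfp(hash_fn, radius=2):
    """Return the set of feature identifiers of a toy ECFP-like fingerprint."""
    ids = {a: hash_fn(INVARIANTS[a]) for a in NEIGHBORS}
    envs = {a: frozenset([a]) for a in NEIGHBORS}  # atoms covered by each environment
    seen_envs = set(envs.values())
    features = set(ids.values())
    for iteration in range(1, radius + 1):
        new_ids, new_envs = {}, {}
        for a in sorted(NEIGHBORS):
            neighbor_ids = tuple(sorted(ids[n] for n in NEIGHBORS[a]))
            new_ids[a] = hash_fn((iteration, ids[a], neighbor_ids))
            new_envs[a] = envs[a].union(*(envs[n] for n in NEIGHBORS[a]))
        for a in sorted(NEIGHBORS):
            # Skip environments that cover an already recorded set of atoms.
            if new_envs[a] not in seen_envs:
                seen_envs.add(new_envs[a])
                features.add(new_ids[a])
        ids, envs = new_ids, new_envs
    return features


features_a = toy_ecfp(hash_crc32)
features_b = toy_ecfp(hash_md5)
print(len(features_a), sorted(features_a))
print(len(features_b), sorted(features_b))
```

In this toy model both runs yield four features for propane, because the number of features depends only on which atom environments are distinct, while the integers themselves depend entirely on the hash function. That is the same pattern as in the two result lists above: equal counts, unrelated identifiers.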
Anyway, the default configuration of our ECFP implementation corresponds to the "standard" defined by the original authors of this method. (For more information, see our user guide and the original paper introducing ECFPs.) Pipeline Pilot may also apply an equivalent configuration, but I could not find any information about such details.
Best regards,
Peter
User 0a45234d6e
20-09-2012 21:32:41
That's what I thought. Thanks for the clarification.
User 0a45234d6e
24-09-2012 02:44:16
Hi, one follow-up question regarding ECFP fingerprints. I wonder whether it is possible to recover the substructure (e.g., as SMARTS) from a hash code, or to generate fingerprints together with substructures that can be visualized later on.
Thanks.
ChemAxon 4a2fc68cd1
25-09-2012 20:55:02
User 0a45234d6e
25-09-2012 21:02:01
Awesome... Thanks very much.