I found that the ECFP results generated with ChemAxon are different from those in Pipeline Pilot. I guess it has something to do with the configuration file. Please help me figure it out.
E.g., for the input molecule "CCC", the results from ChemAxon are: -887929887
from Pipeline Pilot are:
The total number of descriptors is the same, but the coding of the descriptors is quite different. Thanks.
The concept of ECFP is the same in ChemAxon's implementation and in Pipeline Pilot, but different hash functions are used internally in the generator algorithms. Therefore, the number of generated ECFP features will typically be the same, but the "coding" of the descriptor (the actual integer identifiers) will definitely be different even if exactly the same atom properties are considered in the two implementations. Note that the applied hash functions are not public, so there is no way to produce the same results.
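To illustrate the point, here is a minimal Python sketch of a simplified ECFP-like generator run on "CCC" with two different hash functions. Everything in it is an assumption made for illustration: the atom invariants, the function names (toy_ecfp, hash_crc32, hash_sha256) and the two hash functions themselves are stand-ins, not the ones used by ChemAxon or Pipeline Pilot, and the duplicate-substructure removal of the real algorithm is left out. It only shows that the same atom environments give the same number of features but different integer identifiers once the hash function changes.

```python
import hashlib
import zlib

# Toy molecule: propane, "CCC".
# Each atom is described by (atomic number, heavy-atom degree, hydrogen count).
ATOMS = [(6, 1, 3), (6, 2, 2), (6, 1, 3)]
BONDS = [(0, 1, 1), (1, 2, 1)]  # (atom index, atom index, bond order)


def to_signed32(value):
    """Map an unsigned 32-bit value onto the signed 32-bit range."""
    return value - 2**32 if value >= 2**31 else value


def hash_crc32(feature):
    """Stand-in hash A (illustrative only, not a vendor hash)."""
    return to_signed32(zlib.crc32(repr(feature).encode()))


def hash_sha256(feature):
    """Stand-in hash B (illustrative only, not a vendor hash)."""
    digest = hashlib.sha256(repr(feature).encode()).digest()
    return int.from_bytes(digest[:4], "big", signed=True)


def toy_ecfp(hash_fn, radius=1):
    """Collect identifiers of all atom environments up to the given radius.

    Simplified: no duplicate-substructure removal as in the real algorithm.
    """
    neighbors = {i: [] for i in range(len(ATOMS))}
    for a, b, order in BONDS:
        neighbors[a].append((order, b))
        neighbors[b].append((order, a))

    # Layer 0: hash the initial atom invariants.
    ids = [hash_fn(atom) for atom in ATOMS]
    features = set(ids)

    # Each further layer combines an atom's identifier with its neighbors'.
    for layer in range(1, radius + 1):
        new_ids = []
        for i in range(len(ATOMS)):
            env = tuple(sorted((order, ids[j]) for order, j in neighbors[i]))
            new_ids.append(hash_fn((layer, ids[i], env)))
        ids = new_ids
        features.update(ids)
    return features


features_a = toy_ecfp(hash_crc32)
features_b = toy_ecfp(hash_sha256)

print("hash A:", sorted(features_a))
print("hash B:", sorted(features_b))
print("same count:", len(features_a) == len(features_b))
print("same identifiers:", features_a == features_b)
```

Both runs see exactly the same atom environments, so the feature counts agree, but comparing the two identifier lists directly is meaningless because each is only defined relative to its own hash function.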
Anyway, the default configuration of our ECFP implementation corresponds to the "standard" defined by the original authors of this method. (For more information, see our user guide and the original paper introducing ECFPs.) Pipeline Pilot may also apply an equivalent configuration, but I could not find information about such details.
That's what I thought. Thanks for the clarification.
Hi, one follow-up question regarding ECFP fingerprints. I wonder whether it is possible to recover the substructure (e.g. as SMARTS) from the hash code, or to generate fingerprints together with their substructures so that they can be visualized later on.
Awesome... Thanks very much.