Technical Support Forum
Tanimoto similarity between two compounds
seijo

Joined: 30 Jul 2010
Posts: 8

Posted: Mon Jul 14, 2014 4:57 pm
Post subject: Tanimoto similarity between two compounds

I am a new JChem Marvin user, and I appreciate it a lot.

I am computing the Tanimoto dissimilarity both with the API and with the command line.

I obtain the same fingerprints with the API and with the command line, but the Tanimoto value calculated with the API (ecfp1.getTanimoto(ecfp2) = 0.8898305) is totally different from the one calculated with the command line:

compr -f 1024 -t 1.0 -r -g -z -L -i fingerprints.txt fingerprints.txt -o data.txt

(The Tanimoto value here is the reported "Maximum dissimilarity between sets" = 0.60294116, because there are only 2 compounds.)

I have attached the SDF file used for the test and the data.txt obtained with the command line.

I also checked the Tanimoto value against the binary fingerprints and obtained the same value as with the API. What am I doing wrong with the command line? Any help, please?

Here is how I compute the Tanimoto value with the API:

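import chemaxon.descriptors.ECFP;
import chemaxon.descriptors.ECFPParameters;
import chemaxon.formats.MolImporter;
import chemaxon.struc.Molecule;

// ECFP settings: 1024-bit fingerprint, diameter 8, no feature counts
ECFPParameters ecfpparam = new ECFPParameters();
ecfpparam.setLength(1024);
ecfpparam.setDiameter(8);
ecfpparam.setKeepCounts(false);

// Fingerprint of the previous molecule (empty before the first one is read)
ECFP ecfpSave = new ECFP(ecfpparam);

MolImporter mi = new MolImporter(filename);
Molecule m = mi.read();

while (m != null) {
    // Generate the fingerprint of the current molecule into a fresh descriptor
    ECFP ecfp = new ECFP(ecfpparam);
    ecfp.generate(m);

    // Compare the current molecule with the previous one
    System.out.println("Tanimoto: " + ecfp.getTanimoto(ecfpSave));

    ecfpSave = ecfp;
    m = mi.read();   // advance to the next molecule, otherwise the loop never ends
}
mi.close();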

The second line of output is the Tanimoto value of interest:

Tanimoto: 0.8898305

This agrees with the value computed from the binary fingerprints.
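
For readers who want to repeat the check against the binary fingerprints, here is a minimal sketch in plain Java. It is not JChem code: the class TanimotoSketch, its methods, and the two short bit strings are made up for illustration, and it assumes each fingerprint is available as a string of 0/1 characters. The dissimilarity is taken as 1 minus the similarity.

```java
import java.util.BitSet;

public class TanimotoSketch {

    /** Parses a string of '0'/'1' characters into a BitSet (index 0 = first character). */
    static BitSet parse(String bits) {
        BitSet set = new BitSet(bits.length());
        for (int i = 0; i < bits.length(); i++) {
            if (bits.charAt(i) == '1') {
                set.set(i);
            }
        }
        return set;
    }

    /** Tanimoto similarity: bits set in both / bits set in at least one. */
    static double similarity(BitSet a, BitSet b) {
        BitSet common = (BitSet) a.clone();
        common.and(b);
        BitSet union = (BitSet) a.clone();
        union.or(b);
        int unionCount = union.cardinality();
        // Convention chosen for this sketch: two empty fingerprints count as identical.
        return unionCount == 0 ? 1.0 : (double) common.cardinality() / unionCount;
    }

    /** Tanimoto dissimilarity, defined here as 1 - similarity. */
    static double dissimilarity(BitSet a, BitSet b) {
        return 1.0 - similarity(a, b);
    }

    public static void main(String[] args) {
        // Toy 8-bit fingerprints, not real ECFP output
        BitSet a = parse("11010010");
        BitSet b = parse("10010110");
        System.out.println("Similarity:    " + similarity(a, b));
        System.out.println("Dissimilarity: " + dissimilarity(a, b));
    }
}
```

With real data, the two strings would be the 1024-character binary fingerprints of the two compounds.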

seijo

Joined: 30 Jul 2010
Posts: 8

Posted: Tue Jul 15, 2014 8:57 am

I found it...

generatemd used with ECFP produces a decimal format that is wrong for compr, jarp, etc.

This then leads to wrong Tanimoto dissimilarity results on the command line.

Clearly, this kind of information should appear in the manual. In my view, a complementary file, FingerprintConverter.java (attached to this message and extracted from a hard-to-find old topic from 2011), should be added to the JChem library, with a short explanation of why it is needed.

https://www.chemaxon.com/forum/ftopic1472.html&sid=6b97639ee572ae0ac11ae2bd6bc2e6a8

 

To avoid any error when using the command line, first compile the converter to obtain a runnable class:

javac FingerprintConverter.java


Then, to obtain correct Tanimoto values with ECFP on the command line, generate binary fingerprint files with generatemd and convert them before running compr:

generatemd c file.sdf -k ECFP -2 -o BinaryFingerprints.txt

java FingerprintConverter BinaryFingerprints.txt correctDecimalFingerprint.txt

compr -f 1024 -t 0.1 -r -g -z -L -w -i correctDecimalFingerprint.txt  correctDecimalFingerprint.txt -o resultsData.txt


Gabor
ChemAxon personnel
Joined: 29 May 2005
Posts: 317

Posted: Fri Jul 18, 2014 4:41 pm

Sorry for missing your original post. I agree that the default decimal representation of the ECFP descriptor family (a list of feature identifiers) might cause confusion, since one would expect a packed binary string representation. In the new descriptors API (under construction) we expose the folded binary representation by default.

regards,

Gabor
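
To illustrate the difference between a list of feature identifiers and a folded binary fingerprint, here is a minimal sketch in plain Java. It is only an illustration under stated assumptions: the class FoldingSketch and its method are hypothetical, the identifiers are made up, and the modulo folding is a simple scheme chosen for the example, not necessarily how JChem or FingerprintConverter.java folds ECFP features.

```java
import java.util.BitSet;

public class FoldingSketch {

    /**
     * Folds integer feature identifiers into a bit set of the given length.
     * Assumption for this sketch: each identifier sets the bit at (id mod length).
     */
    static BitSet fold(int[] featureIds, int length) {
        BitSet bits = new BitSet(length);
        for (int id : featureIds) {
            // floorMod keeps the index non-negative for negative identifiers
            bits.set(Math.floorMod(id, length));
        }
        return bits;
    }

    public static void main(String[] args) {
        // Made-up identifiers, not real ECFP features
        int[] featureIds = {-1, 1023, 1024, 5};
        BitSet folded = fold(featureIds, 1024);
        System.out.println("Bits set after folding: " + folded);
    }
}
```

The point is that a tool expecting packed binary words will misread a plain list of identifiers, so the two representations must not be mixed when computing Tanimoto values.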
