Tanimoto calculation
bjoern

Joined: 08 Nov 2013
Posts: 8

Posted: Mon Jan 19, 2015 4:12 pm

Hi all,

I hope I have chosen the right section of the forum.

I need to calculate the similarity of two fingerprints (for example ECFP4).

To do this, I have calculated the binary ECFP fingerprints (generatemd) and stored them in a file. Now I want to calculate the similarity for two compounds, or ideally all against all.

I have to do it on the command line. The "jcsearch" command should work for this, but I am not sure how to use it or whether it is the right command.

Can anybody help me?

Thanks in advance,

Björn

Daniel
ChemAxon personnel
Joined: 22 Jan 2012
Posts: 315

Posted: Mon Jan 26, 2015 3:13 pm

Dear Björn,

The easiest way to calculate similarity values for molecule pairs is to use our screenmd command line tool.

If you have files for the target and query molecules, you can use it as follows:

 screenmd targets.sdf queries.smiles -g -k ECFP -c ecfp.xml

I hope this helps.

Best regards,

Daniel

bjoern

Joined: 08 Nov 2013
Posts: 8

Posted: Wed Apr 01, 2015 9:58 am

Dear Daniel,

thanks for your answer.

This works well.

Let me briefly summarize the screenmd command:

- The results are dissimilarity scores.

- No filter is used.

TASK:

Now I want to get the Tanimoto values directly. For this I used the Tanimoto metric from the ecfp.xml file.

screenmd input.sdf reference.cfp -g -k ECFP -c ecfp.xml -M Tanimoto

I get a result, but the "Tanimoto" values (which should now be similarities, not dissimilarities) look wrong. The predefined threshold is 0.2.

What is wrong with this command? Or am I misunderstanding something?

 

Best

Bjoern

 

Daniel
ChemAxon personnel
Joined: 22 Jan 2012
Posts: 315

Posted: Wed Apr 01, 2015 1:45 pm

Dear Bjoern, 

screenmd can only generate dissimilarity values. You can either subtract them from 1 manually to get the Tanimoto similarity, or change the threshold in the XML config file to 1 minus the dissimilarity threshold for the calculation.

Does this help?

Daniel

bjoern

Joined: 08 Nov 2013
Posts: 8

Posted: Wed Apr 01, 2015 3:46 pm

Hmm,

 

now I am really confused. I found the following description of the metrics: (https://docs.chemaxon.com/display/jchembase/Similarity+search#Similaritysearch-search)

Metrics

Similarity / Dissimilarity metrics for molecules

Various metrics are provided in JChem to compute the value of similarity or dissimilarity. Some metrics (for example Tanimoto) provide similarity values, some other metrics (for example Euclidean) provide dissimilarity values. The values calculated with the metrics listed in the table below (with the exception of Euclidean) vary from 0 to 1. Similarity (S) value can be calculated from the value of dissimilarity(D): S = 1 - D (with the exception of Euclidean metric).
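To make the relation S = 1 - D in the quoted documentation concrete, here is a minimal Python sketch of the textbook Tanimoto coefficient on binary fingerprints. It assumes each fingerprint is given as the set of its on-bit indices; the function names and example bit sets are made up for illustration and are not part of JChem or screenmd.

```python
def tanimoto_similarity(bits_a, bits_b):
    """Tanimoto similarity of two binary fingerprints given as sets of on-bit indices."""
    a, b = set(bits_a), set(bits_b)
    union = len(a | b)
    if union == 0:
        # Two empty fingerprints: treated as identical here (a convention, not a rule).
        return 1.0
    return len(a & b) / union


def tanimoto_dissimilarity(bits_a, bits_b):
    """Dissimilarity as defined in the quoted text: D = 1 - S."""
    return 1.0 - tanimoto_similarity(bits_a, bits_b)


# Hypothetical fingerprints: 3 shared bits, 6 bits in the union.
fp_query = {1, 5, 9, 12}
fp_target = {1, 5, 7, 12, 20}
s = tanimoto_similarity(fp_query, fp_target)
d = tanimoto_dissimilarity(fp_query, fp_target)
print(f"S = {s:.3f}, D = {d:.3f}")  # prints "S = 0.500, D = 0.500"
```

As the documentation notes, this conversion does not apply to the Euclidean metric, whose values are not limited to the range 0 to 1.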

 

Actually, I do not really understand how I could change the threshold to 1-diss.thr. in the XML file. What should this look like?

 

Best,

Bjoern

Daniel
ChemAxon personnel
Joined: 22 Jan 2012
Posts: 315

Posted: Thu Apr 02, 2015 10:41 am

Hi Bjoern, 

You can modify the dissimilarity threshold in the ECFP config XML:

<ScreeningConfiguration>
    <ParametrizedMetrics>
        <ParametrizedMetric Name="Tanimoto" ActiveFamily="Generic" Metric="Tanimoto" Threshold="0.5"/>
        <ParametrizedMetric Name="Euclidean" ActiveFamily="Generic" Metric="Euclidean" Threshold="10"/>
    </ParametrizedMetrics>
</ScreeningConfiguration>

In this case the dissimilarity threshold is 0.5. This means that every target molecule that has dissimilarity (compared to the query) greater than 0.5 will be listed in the output. This is the same as listing every target molecule that has smaller similarity than 0.5 to the query. What you can do is set a dissimilarity threshold and sort the output in increasing order based on the Tanimoto value column. Then the most similar targets will be at the top.
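The sorting step can be sketched in Python as follows. This is only an illustration: it assumes the dissimilarity values have already been read into (target id, dissimilarity) pairs, and the function name and the example values are hypothetical rather than real screenmd output.

```python
def rank_by_similarity(hits, top_n=None):
    """Sort (target_id, dissimilarity) pairs so the most similar targets come first.

    Returns (target_id, dissimilarity, similarity) tuples, with similarity = 1 - dissimilarity.
    """
    ranked = sorted(hits, key=lambda hit: hit[1])  # increasing dissimilarity
    if top_n is not None:
        ranked = ranked[:top_n]
    return [(target_id, d, round(1.0 - d, 6)) for target_id, d in ranked]


# Made-up example values for three targets against one query.
hits = [("mol_3", 0.82), ("mol_1", 0.15), ("mol_2", 0.40)]
for target_id, d, s in rank_by_similarity(hits):
    print(target_id, d, s)
```

Sorting by increasing dissimilarity is equivalent to sorting by decreasing similarity, so no separate similarity metric is needed for the ranking itself.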

Daniel

bjoern

Joined: 08 Nov 2013
Posts: 8

Posted: Mon Apr 27, 2015 9:07 am

Hey,

 

I am interested in the similarity matrix directly, not the dissimilarity.

I tried some variants of the metrics, but nothing works.

So some questions:

- Am I using the right command for a SIMILARITY search?

- Is there a metric that computes 1 - dissimilarity?

 

Best.

Bjoern

Daniel
ChemAxon personnel
Joined: 22 Jan 2012
Posts: 315

Posted: Wed Apr 29, 2015 10:35 am

Hi Bjoern, 

Unfortunately screenmd can only handle dissimilarity metrics. I suggest that you write a very simple script to calculate the similarity matrix based on the calculated dissimilarity. 
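A script of that kind might look like the Python sketch below. The table layout is an assumption made for illustration, not the documented screenmd output format: tab-separated text with a header row, an identifier in the first column, and one dissimilarity column per query. The function name is likewise hypothetical, and the conversion is only meaningful for metrics whose values lie between 0 and 1, such as Tanimoto.

```python
import csv
import io


def dissimilarity_table_to_similarity(text, delimiter="\t"):
    """Convert a table of dissimilarity values into similarity values (S = 1 - D).

    Assumed layout (hypothetical): header row, then one row per target with the
    identifier in the first column and one dissimilarity value per query after it.
    """
    reader = csv.reader(io.StringIO(text), delimiter=delimiter)
    header = next(reader)
    rows = []
    for row in reader:
        if not row:
            continue  # skip blank lines
        target_id, values = row[0], row[1:]
        rows.append([target_id] + [round(1.0 - float(v), 6) for v in values])
    return header, rows


# Made-up input: two targets scored against two queries.
example = "id\tq1\tq2\nmol_1\t0.25\t0.90\nmol_2\t0.60\t0.00\n"
header, rows = dissimilarity_table_to_similarity(example)
print("\t".join(header))
for row in rows:
    print("\t".join(str(x) for x in row))
```

To use it on real output, redirect the screenmd result to a file, read it into a string, and adjust the delimiter and column handling to whatever the file actually contains.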

Daniel
