I need to be able to rank a list of small to medium sized organic compounds in order of their individual similarities to a reference compound - based on some similarity measure such as the Tanimoto coefficient.

I am new to JChem and it would be great if someone could give me a rundown of the steps needed to do this.

I have downloaded JChem Base (as of yesterday) onto my computer (running Windows XP), but that is pretty much as far as I have got.

Have you tried Instant JChem yet? That's probably the easiest and most convenient tool for this kind of work. You can try it online here: http://www.chemaxon.com/products/online-tryouts/instant-jchem-via-webstart/

Yes, with Instant JChem (IJC) you could easily create a database of your compounds and then search them with your reference compound(s) using similarity search. Other things are possible too, but I think that should meet your basic need. What you would need to do is:

1. run IJC and create a project with a local database
2. import your structures into that database
3. run a similarity search for your reference compound

For more details see these animations:




many thanks for that - IJC does exactly what we want...


have just been playing around with the similarity query...

After running a "structure - similarity" query I am assuming that the number above the compound structure in the Structure column of the resultant list is the associated Tanimoto coefficient (when using this as the similarity measure)?

However the resulting list does not strictly rank the compounds by these numbers (though there is a trend to the numbers) and compounds that are ranked lower (i.e. lower on the list) often have slightly higher numbers than compounds ranked above them (again I'm assuming these numbers refer to the associated Tanimoto coefficients).

Have I mistaken what this number actually refers to?



You are right: the ordering is from most similar to least similar, and the numbers displayed are the similarity scores.
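
For anyone who wants to see what "ranking by Tanimoto" amounts to, here is a minimal sketch in plain Python. It is not how IJC or JChem implements the search; it assumes each compound has already been reduced to a fingerprint, represented here as a set of "on" bit positions. The compound names and bit sets are made up for illustration.

```python
def tanimoto(a, b):
    """Tanimoto coefficient of two fingerprints given as sets of on-bit indices."""
    union = len(a | b)
    # Two empty fingerprints are scored 0.0 here; that convention is an assumption.
    return len(a & b) / union if union else 0.0


def rank_by_similarity(reference, library):
    """Return (name, score) pairs sorted from most to least similar."""
    scored = [(name, tanimoto(reference, fp)) for name, fp in library.items()]
    return sorted(scored, key=lambda item: item[1], reverse=True)


# Hypothetical fingerprints, not derived from real structures.
reference = frozenset({1, 2, 3, 4})
library = {
    "cmpd_A": frozenset({1, 2, 3, 4, 5}),
    "cmpd_B": frozenset({1, 2}),
    "cmpd_C": frozenset({3, 4, 5, 6}),
}

ranking = rank_by_similarity(reference, library)
for name, score in ranking:
    print(f"{name}: {score:.3f}")
```

The score is the number of bits the two fingerprints share divided by the number of bits set in either, so it runs from 0 (nothing in common) to 1 (identical fingerprints).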

And yes, the numbers are slightly inconsistent. This is because they are calculated in a slightly different way from the scores used to order the search results.

We are going to try to remove this inconsistency.


That is correct. Currently the visualization routine in IJC uses a fixed similarity measure, which is very likely to differ from the similarity measure configured for the table.
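
To illustrate the effect, here is a small Python sketch in which rows are ordered by one measure while the number shown next to each row comes from another. The thread does not say which measures or settings IJC actually uses in each place, so the choice below (Tanimoto for ordering, an asymmetric Tversky index for display) is purely an assumption, as are the fingerprints.

```python
def tversky(a, b, alpha, beta):
    """Tversky index on sets of on-bit indices; alpha = beta = 1 gives Tanimoto."""
    common = len(a & b)
    denom = common + alpha * len(a - b) + beta * len(b - a)
    return common / denom if denom else 0.0


# Hypothetical fingerprints, not derived from real structures.
reference = frozenset({1, 2, 3, 4, 5, 6})
library = {
    "X": frozenset({1, 2, 3}),
    "Y": frozenset({1, 2, 3, 4, 7, 8, 9, 10}),
    "Z": frozenset({1, 2, 3, 4, 5, 6, 7}),
}


def ordering_score(fp):
    # Measure assumed for sorting the hit list (Tanimoto).
    return tversky(reference, fp, 1.0, 1.0)


def displayed_score(fp):
    # Different, fixed measure assumed for the number shown with each structure.
    return tversky(reference, fp, 1.0, 0.0)


rows = sorted(library.items(), key=lambda kv: ordering_score(kv[1]), reverse=True)
table = [(name, round(displayed_score(fp), 3)) for name, fp in rows]
for name, shown in table:
    print(name, shown)
```

In this toy case the rows come out in the order Z, X, Y, but the displayed numbers are 1.0, 0.5 and 0.667, so Y shows a higher number than X despite sitting below it. That is the same pattern as described in the question.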


This will be resolved in the next major version (5.5). The plan is that the displayed similarity score will always use the same settings as the table/query.


