Reverse Similarity search

User fdee5ee126

08-07-2007 18:48:04

Hi there,


I have several actives that I want to use to build a database of "most probably inactives". To do this I want to run a "dissimilarity search", i.e. retrieve only those compounds in my database that are dissimilar to the actives in terms of substructure-fingerprint Tanimoto similarity.


Perhaps some experienced dissimilarity/diversity experts can share a few tips and tricks.


Has anyone else faced (and even better: solved) this problem yet?





Thanks in advance,





Markus Kossner

User 677b9c22ff

10-07-2007 06:57:25

Hi,


Active what?





Check this out:


1) Chemical Similarity – An overview


Dr. Nina Jeliazkova


"Structure is not the sole factor for biological activity"





2) MOSS or MOFA from Christian Borgelt





3) jarp, generatemd, compr


"maximum dissimilarity" site:chemaxon.com






4) Chemical Structure Representation and Search Systems


John Barnard





Kind regards


Tobias

ChemAxon a3d59b832c

14-07-2007 20:35:05

Hi Markus,





Miklós, our similarity expert, is on holiday, so I can only give you some hints from the database perspective. He will post a full answer here when he is back at the end of July.





All ChemAxon database tools can run a "dissimilarity search" against a single query (one active). To do so, use the similarity search type, enable negated search (non-hits), and set a similarity threshold value. After running it for each of your actives individually, you can intersect, merge, etc. the result sets according to your preference.
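To make the "search each active, take the non-hits, then intersect" idea concrete, here is a minimal Python sketch. It does not use any ChemAxon API: fingerprints are simply modelled as sets of "on" bit positions, and the names `tanimoto`, `non_hits`, `dissimilar_to_all`, the compound IDs, and the 0.3 threshold are all made up for illustration. The threshold is treated as a similarity cutoff here; a real tool may express it as a dissimilarity value instead.

```python
# Sketch only: fingerprints are modelled as sets of "on" bit positions.
# All names and data below are hypothetical, not ChemAxon API calls.

def tanimoto(fp_a, fp_b):
    """Tanimoto similarity of two bit sets: |A & B| / |A | B|."""
    union = len(fp_a | fp_b)
    if union == 0:
        return 0.0  # convention chosen here for two empty fingerprints
    return len(fp_a & fp_b) / union


def non_hits(query_fp, database, threshold):
    """Negated similarity search: IDs whose similarity to the query is below the threshold."""
    return {cid for cid, fp in database.items() if tanimoto(query_fp, fp) < threshold}


def dissimilar_to_all(actives, database, threshold):
    """Intersect the per-active non-hit sets, keeping compounds dissimilar to every active."""
    result = set(database)
    for active_fp in actives:
        result &= non_hits(active_fp, database, threshold)
    return result


# Toy data (invented for this example).
database = {
    "cpd1": {1, 2, 3, 4},
    "cpd2": {1, 2, 9},
    "cpd3": {20, 21, 22},
    "cpd4": {5, 6, 7, 20},
}
actives = [{1, 2, 3}, {5, 6, 7, 8}]

candidates = dissimilar_to_all(actives, database, threshold=0.3)
print(sorted(candidates))  # ['cpd3']
```

Intersecting is the strict choice (a compound must be dissimilar to every active); merging the non-hit sets instead would keep anything that is dissimilar to at least one active.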





Alternatively, you can use the Screen package, which (among other things) allows the generation and use of a hypothesis (consensus) fingerprint based on the actives. Screen is integrated into JChem Base or can be used on flat files, and it includes many different descriptors and similarity metrics. I suggest starting with the technical presentation: http://www.chemaxon.com/conf/Screen.ppt
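As a rough illustration of the consensus-fingerprint idea, the Python sketch below builds one fingerprint from several actives by majority vote and then ranks database compounds by ascending similarity to it. This is only one possible way to form a consensus and is an assumption, not a description of how Screen actually builds its hypotheses. Fingerprints are again modelled as sets of "on" bit positions, and `consensus_fingerprint`, `min_fraction`, and the toy data are hypothetical.

```python
from collections import Counter

# Sketch only: majority-vote consensus is an assumption made for illustration,
# not Screen's documented algorithm. All names and data are hypothetical.

def consensus_fingerprint(actives, min_fraction=0.5):
    """Return the bits that are set in at least min_fraction of the active fingerprints."""
    counts = Counter(bit for fp in actives for bit in fp)
    needed = min_fraction * len(actives)
    return {bit for bit, n in counts.items() if n >= needed}


def tanimoto(fp_a, fp_b):
    """Tanimoto similarity of two bit sets: |A & B| / |A | B|."""
    union = len(fp_a | fp_b)
    if union == 0:
        return 0.0  # convention chosen here for two empty fingerprints
    return len(fp_a & fp_b) / union


# Toy data (invented for this example).
actives = [{1, 2, 3, 4}, {1, 2, 3, 9}, {1, 2, 7}]
database = {
    "cpdA": {1, 2, 3, 5},
    "cpdB": {10, 11},
    "cpdC": {1, 12, 13, 14},
}

hypothesis = consensus_fingerprint(actives)

# Rank compounds from least to most similar to the consensus fingerprint.
ranked = sorted(
    ((tanimoto(hypothesis, fp), cid) for cid, fp in database.items())
)
for similarity, cid in ranked:
    print(f"{cid}: {similarity:.2f}")
```

With a single consensus query, only one search is needed instead of one per active, at the cost of losing features that only a minority of the actives carry.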





I hope this helps.


Szabolcs

ChemAxon a3d59b832c

14-07-2007 20:53:11

Sorry, I forgot to mention that the default similarity in the database is the Tanimoto metric on chemical fingerprints, just as you mentioned in your post.
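For reference, the usual definition of the Tanimoto coefficient on binary fingerprints is given below. The symbols are introduced here for illustration: a and b are the numbers of bits set in fingerprints A and B, and c is the number of bits set in both. The dissimilarity D is simply the complement.

```latex
\[
T(A,B) = \frac{c}{a + b - c}, \qquad D(A,B) = 1 - T(A,B)
\]
```

For example, with a = 4, b = 3 and c = 2, T = 2 / (4 + 3 - 2) = 0.4, so the dissimilarity is 0.6.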