K-Nearest Neighbor

User 13895fa0b3

20-07-2010 22:22:43

Hello there,


I have a structure database, and I want to be able to perform a K-Nearest Neighbor search. In other words, I always want the top K results returned, regardless of their similarity. What is the best way to go about doing this?


Currently, I'm thinking of doing a search where I limit the results to K hits, but set the similarity threshold to 0.1 (or something else really low). However, is this guaranteed to return the top hits, in order?


Perhaps there is a better way to do this?


Thanks,


-Roman

ChemAxon 9c0afc9aaf

21-07-2010 03:31:10

Hi,


If you are using JChem 5.2.5 or later, you are guaranteed to get the most similar results.


(That is when we changed the behavior, since most users expect it to work this way.)


From the list of changes:


https://www.chemaxon.com/jchem/changes.html


"Similarity search returns most similar hits if hit count is limited. The maxTime option is recommended instead of maxHits in the case of descriptor similarity search."

Since you actually have to provide a dissimilarity threshold rather than a similarity threshold (so that we can support various metrics), you should set this value really high so that no hits are excluded.
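
To illustrate the idea (a hit limit of K combined with a permissive dissimilarity threshold), here is a minimal Python sketch. It does not use the JChem API: the function names, the set-based fingerprints, and the toy data are all hypothetical, and Tanimoto dissimilarity (1 minus Tanimoto similarity) is assumed as the metric.

```python
import heapq


def tanimoto_dissimilarity(a, b):
    """Return 1 - |a & b| / |a | b| for fingerprints given as sets of on-bit indices."""
    union = len(a | b)
    if union == 0:
        return 0.0  # assumption: two empty fingerprints count as identical
    return 1.0 - len(a & b) / union


def k_nearest(query, database, k, max_dissimilarity=1.0):
    """Return up to k (dissimilarity, id) pairs, most similar first.

    With max_dissimilarity at the metric's maximum (1.0 for Tanimoto),
    nothing is filtered out, so the result is a plain K-nearest-neighbor list.
    """
    scored = (
        (tanimoto_dissimilarity(query, fp), struct_id)
        for struct_id, fp in database.items()
    )
    within = (pair for pair in scored if pair[0] <= max_dissimilarity)
    # nsmallest keeps only the k lowest dissimilarities, sorted ascending
    return heapq.nsmallest(k, within)


# Hypothetical toy fingerprints, for illustration only
database = {
    "A": frozenset({1, 2, 3, 4}),
    "B": frozenset({1, 2, 3}),
    "C": frozenset({7, 8}),
    "D": frozenset({1, 9}),
}
query = frozenset({1, 2, 3, 4})

top = k_nearest(query, database, k=3)
for dissimilarity, struct_id in top:
    print(struct_id, round(dissimilarity, 3))
```

With the threshold at 1.0 the three closest structures come back in order (A, B, D) even though D is quite dissimilar. Lowering max_dissimilarity to 0.5 would return only A and B, which is why a strict threshold can yield fewer than K hits.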


Best regards,


Szilard