What shall be the new name of exact search?

ChemAxon a3d59b832c

02-10-2008 09:45:56

There is often confusion between exact and perfect search. The main reason for this confusion is that JChem's exact search currently differs from the exact search feature of other chemical database systems and toolkits.





In those terminologies, exact search is used to identify duplicates of molecules. Currently this kind of search is available in JChem as "perfect search". JChem's exact search, on the other hand, is a special type of substructure search where the heavy atom network of the query and target molecules must be equal for a match. All other features (e.g. query atoms, query properties, stereochemistry, formal charges, radicals) are treated the same way as for substructure search. A related search type is exact fragment search, where the results may contain additional fragments besides the matched fragment. Some examples of the current JChem search types are attached; further examples can be seen on this page: http://www.chemaxon.com/jchem/doc/user/query_searchtypes.html
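To make the distinction above more concrete, here is a minimal toy sketch in Python. It is not JChem code and uses no JChem API. Molecules are plain heavy-atom graphs, "*" is a made-up "any atom" query atom, and all function names (substructure, exact_current, exact_fragment, perfect) are hypothetical. Stereochemistry, charges, radicals, bond orders and standardization are ignored, and the brute-force matching is only suitable for tiny examples.

```python
from itertools import permutations

def mol(atoms, bonds):
    """Toy molecule: heavy-atom symbols plus undirected bonds between atom indices."""
    return {"atoms": list(atoms), "bonds": {frozenset(b) for b in bonds}}

def atom_matches(q_sym, t_sym, query_atoms):
    # "*" is a made-up 'any atom' query atom, honoured only when query_atoms is True.
    return (query_atoms and q_sym == "*") or q_sym == t_sym

def has_embedding(query, target, query_atoms=True):
    """True if the query atoms map injectively onto target atoms, preserving query bonds."""
    n = len(query["atoms"])
    for perm in permutations(range(len(target["atoms"])), n):
        atoms_ok = all(atom_matches(query["atoms"][i], target["atoms"][perm[i]], query_atoms)
                       for i in range(n))
        bonds_ok = all(frozenset(perm[i] for i in b) in target["bonds"]
                       for b in query["bonds"])
        if atoms_ok and bonds_ok:
            return True
    return False

def same_size(query, target):
    # Equal atom and bond counts turn an embedding into a one-to-one match of the networks.
    return (len(query["atoms"]) == len(target["atoms"])
            and len(query["bonds"]) == len(target["bonds"]))

def fragments(m):
    """Split a toy molecule into its connected components, each as its own toy molecule."""
    seen, parts = set(), []
    for start in range(len(m["atoms"])):
        if start in seen:
            continue
        stack, comp = [start], []
        seen.add(start)
        while stack:
            a = stack.pop()
            comp.append(a)
            for b in m["bonds"]:
                if a in b:
                    for other in b - {a}:
                        if other not in seen:
                            seen.add(other)
                            stack.append(other)
        comp.sort()
        index = {old: new for new, old in enumerate(comp)}
        parts.append(mol([m["atoms"][i] for i in comp],
                         [[index[i] for i in b] for b in m["bonds"] if b <= set(comp)]))
    return parts

def substructure(query, target):
    return has_embedding(query, target)

def exact_current(query, target):
    # Current "exact": substructure rules, but the heavy-atom networks must be equal.
    return same_size(query, target) and has_embedding(query, target)

def exact_fragment(query, target):
    # Current "exact fragment": one whole fragment of the target must match the query.
    return any(exact_current(query, frag) for frag in fragments(target))

def perfect(query, target):
    # Duplicate check in this toy model: same network and no query-atom wildcards.
    return same_size(query, target) and has_embedding(query, target, query_atoms=False)

chain = [(0, 1), (1, 2)]
ethanol = mol(["C", "C", "O"], chain)
ethylamine = mol(["C", "C", "N"], chain)
propanol = mol(["C", "C", "C", "O"], [(0, 1), (1, 2), (2, 3)])
ethanol_with_na = mol(["C", "C", "O", "Na"], chain)   # extra one-atom fragment
query = mol(["C", "C", "*"], chain)                   # C-C-(any atom)

print(substructure(query, propanol), exact_current(query, propanol))   # True False
print(exact_current(query, ethanol), exact_current(query, ethylamine)) # True True
print(perfect(query, ethanol), perfect(ethanol, ethanol))              # False True
print(exact_current(ethanol, ethanol_with_na),
      exact_fragment(ethanol, ethanol_with_na))                        # False True
```

In this toy model the only thing separating the current exact search from perfect search is the query atom, whereas in JChem the other query features listed above also play a role.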





To reduce the confusion, we decided to rename exact search in JChem as of JChem 5.2 (planned for release in the first half of 2009). In a later release, perfect search will in turn be renamed to exact.





Now the question is: what name should we give to the current exact search that properly describes its behavior? Exact fragment search should be renamed to follow the same terminology.





Short names would be preferable.





Some ideas are:





A. Whole structure search, whole fragment search


B. Full size substructure search, full size substructure fragment search


C. Full structure search, full fragment search


D. Complete structure search, complete fragment search





We welcome any ideas, opinions and votes.

ChemAxon 42004978e8

02-10-2008 12:41:21

Some further thoughts on whether to match H or not.


As I understand it so far, (this type of) exact search doesn't exist in other systems. They use this name for our perfect search.


We define exact as something between SSS (substructure search) and perfect search. Exact matching of the heavy atoms is one definition.


(Maybe there is an assumption that this mostly means matching the Hs as well.)





Full-molecule matching would also require matching the hydrogens, so that the query and the target have the same number of atoms. However, this can't be fulfilled with query atoms.





Either way is fine with me, with or without checking Hs; it's only a question of definition. However, if Hs are checked, then query atoms need separate handling (the check should be skipped for them).

User 870ab5b546

02-10-2008 12:49:50

How about "loose exact"? It retains the implication of exactness, but suggests that the query could match several different targets.

ChemAxon a3d59b832c

02-10-2008 13:51:51

If possible, I would avoid the word "exact" to prevent any more confusion.

User 8688ffe688

02-10-2008 15:15:37

What about "relax" search? Another would be "atomic" search?

ChemAxon fa971619eb

03-10-2008 09:42:55

I agree that the terms are misleading, and also that it is difficult to come up with better ones.





Even the current perfect search is misleading, whether it is called perfect or exact. This is because the standardization rules affect this, so 2 structures that are not identical can be considered duplicates in the database. For this reason maybe the name "duplicate search" is most appropriate for this?





As for the current exact search, what about the name "scaffold search"? What this really is is a comparison of the heavy atom scaffolds, with varying degrees of tolerance for variations in atom and bond features. But the scaffolds themselves must be identical.





Just my 2 cents worth.





Tim

ChemAxon a3d59b832c

03-10-2008 11:03:39

tdudgeon wrote:



Even the current perfect search is misleading, whether it is called perfect or exact. This is because the standardization rules affect this, so 2 structures that are not identical can be considered duplicates in the database.


One goal of standardization is to recognize different representations of the same molecule. In this respect, it is correct to return these records.

tdudgeon wrote:



For this reason maybe the name "duplicate search" is most appropriate for this?


Szilard also proposed this name and it makes sense. However, as previously said, if we want to adopt the other terminologies, we will eventually have to rename perfect search to exact.

User 8139ea8dbd

06-10-2008 05:24:49

I guess the reason "exact search" was introduced was to find structures that are equivalent from the user's point of view, but may not be "perfectly the same" for registration purposes.





Obviously, what is considered "equivalent" is further refined by the additional search options.

User 7b0ee04e66

08-10-2008 15:09:10

Here, we tend to use 'flat match' to replace 'exact'.


Catherine

User 870ab5b546

08-10-2008 15:12:43

How about "good match"?

ChemAxon a3d59b832c

13-10-2008 06:43:08

Thank you for the suggestions, now we have plenty of names to choose from. :)





I think I more or less understand the rationale behind all names except "flat match".





Catherine, can you explain why you tend to use it for exact search?





Best regards,


Szabolcs