Search and data output from multiple tables

User b8eed45673

30-06-2008 14:19:32

Except for the main table with structures I use additional table with supporting data. Fields from the additional table are queried when search is executed: requested values are transfered to the method setFilterQuery() of JChemSearchOptions class. But before query results can be passed to output I have to run one more SQL-query to receive data from the additional table according to initially specified conditions - this looks like redundant action as these records have been already fetched before. Is there a way to use only a single SQL-query? Can I somewhow use the new method getHitsAsMolecules() (introduced in 5th version) to retrieve field values from multiple tables instead of the one, specified by method setStructureTable())?

ChemAxon a9ded07333

01-07-2008 12:30:30

setFilterQuery() uses only the cd_id field of the given SQL-query. So the best you can do is what you described: use setFilterQuery() and after the search run an SQL-query on the additional tables using IDs returned by the search.





getHitsAsMolecules() fetches fields only from the structure table where the molecules are stored.





Tamás

ChemAxon 9c0afc9aaf

01-07-2008 15:02:58

Quote:
requested values are transfered to the method setFilterQuery() of JChemSearchOptions class. But before query results can be passed to output I have to run one more SQL-query to receive data from the additional table according to initially specified conditions - this looks like redundant action as these records have been already fetched before.
You should not specify any other field to setFilterQuery() than cd_id. Though it causes no harm it can reduce efficiency.





It would not make sense even theoretically to use the SQL in setFilterQuery() to retrieve these fields for display because


- not all of these rows will be hits, so data would be moved unnecessary


- caching these values could consume a huge amount of memory which the server might not be able to handle





Best regards,





Szilard

User b8eed45673

02-07-2008 13:34:54

But I do selection only from cd_id field in setFilterQuery(). The main problem is: set of cd_id values (contained within searcher object) in my case does not contain the only required record identifiers, as data are distributed over several tables. I set query parameters instantly to search these additional tables while running main table search - and I'm getting set of cd_id values. However, to retrieve results containig information from all these additional fields, I have to run the same query over additional tables again, using set of previously fetched cd_id values. This is somewhat inconvenient in case of large sets, and also affects performance. I was expecting to use single SQL query to to avoid running the same query from additional tables for the second time.

ChemAxon 9c0afc9aaf

09-07-2008 12:21:49

Hi,





The setFilterQuery() is intended to speed up searches in cases when running the select is relatively quick compared to the time saved by not having to run the structure search on the whole table.


If the search is very quick (e.g. small table, very specific query structure), and the select takes a long time (complex database operation), it might be faster NOT to use the filter query option, but filter the hits later.


To decide which solution is better is up to the application developer.





I advise to take care of indexing database columns so these selects run as fast as possible.





The important difference between the filterQuery select and the second one is that the although the filterQuery may refer to those additional tables and their fields, none of these fields should be returned by it (they should only appear after the "where": before the "from" only the cd_id column should be listed). Although the evaluation after the where (e.g. joins) still hase to be made, this prevents actually pumping those values trough JDBC in vain.





If you have any workable idea how to improve this without the problem I mentioned above (namely: storing all the field values when executing filterQuery for possibly many millions of hits) we would consider it.





Best regards,





Szilard