jcsearch difference in searching against a file vs. database

User 5095fcb72d

02-04-2008 21:53:32

Hi all,





I am a bit confused about jcsearch's behaviour when I search for molecules in a database. If I search for multiple structures against a smiles file I get all the results I expect, but if I search against a database table I only get the first molecule in the file.


An example of my output is below:


From Smiles:


jcsearch -q test.smi test.smi --or -t:e -f :TName


CID000000003


CID000000004


CID000000005


CID000000006


CID000000007


CID000000008


CID000000009


CID000000010


CID000000011





From Database:


jcsearch -q test.smi DB:stitch_compounds --or -t:e -f :TChemical


CID000000003





The only difference is where I am searching (the DB contains the same compounds as test.smi).





Any ideas on where I am going wrong?





Thanks :)





Iain

ChemAxon a3d59b832c

03-04-2008 08:00:03

Hi,





The --or and --and options are not available for the database targets. (Sorry, the documentation was not clear about this.)





We are planning to develop this in JChem 5.2. (To be completed around the end of the year or early next year.)





Best regards,


Szabolcs

User 5095fcb72d

04-04-2008 01:19:30

Thanks for the reply.





I am a bit confused now, though, about which database I should use. Ideally I would like to create a local copy of PubChem (about 12 million compounds) and use it to annotate compounds from some high-throughput screens (a few thousand compounds).





I could use jcsearch, searching the database with one compound at a time, but this seems to be a slow and inefficient way of approaching the problem. Alternatively I could use Instant JChem, but I am finding that very confusing as it doesn't seem very similar to SQL to me at all. I would like to use a MySQL database. Can the academic version of Instant JChem interact with a local MySQL database created with JChem Manager?
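For illustration, the one-compound-at-a-time approach could be scripted roughly as below. This is only a sketch under assumptions: it assumes a SMILES file with one structure per line, and it reuses only the jcsearch options already shown in this thread (-q, DB:table, -t:e, -f :TChemical). The function names split_smiles_file and build_commands are made up for this example and are not part of JChem.

```python
import os


def split_smiles_file(path, out_dir):
    """Write each non-empty line of a SMILES file to its own file.

    Returns the list of single-structure file paths, in input order.
    """
    with open(path) as handle:
        lines = [line.strip() for line in handle if line.strip()]
    paths = []
    for index, line in enumerate(lines):
        single = os.path.join(out_dir, "query_%05d.smi" % index)
        with open(single, "w") as out:
            out.write(line + "\n")
        paths.append(single)
    return paths


def build_commands(query_paths, table="stitch_compounds"):
    """Build one jcsearch command line per single-structure query file.

    Only options that appear earlier in this thread are used; the --or
    option is left out because it is not available for database targets.
    """
    return [
        ["jcsearch", "-q", query, "DB:" + table, "-t:e", "-f", ":TChemical"]
        for query in query_paths
    ]
```

Each command list could then be passed to subprocess.run(cmd, capture_output=True, text=True) and the hits collected from stdout. Every run starts jcsearch from scratch, so any per-run start-up cost is paid once per query.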





Alternatively, I am wondering if I used the free version of Oracle with the JChem Cartridge would that solve all my problems?





Thanks for any suggestions you can give :)





Iain

ChemAxon aa7c50abf8

04-04-2008 08:49:04

Quote:
Alternatively, I am wondering if I used the free version of Oracle with the JChem Cartridge would that solve all my problems?
Because Oracle Express Edition lacks an essential feature (JChem Cartridge currently relies on Java Stored Procedures), Oracle XE is not supported by JChem Cartridge. Please see the Software Requirements for JChem Cartridge: http://www.chemaxon.com/jchem/doc/admin/cartridge.html#req.





Peter

ChemAxon fa971619eb

04-04-2008 09:04:16

Quote:
Can the academic version of Instant JChem interact with a local MySQL database created with JChem Manager?
Yes, Instant JChem can use a MySQL database (local or remote), and this can also be accessed through jcman, but you need to ensure that the same version of JChem is used. However, I would not recommend using Instant JChem for a database of 12 million compounds. You will need very large amounts of memory for this.





You need an IJC license to use a MySQL or Oracle database, but you can obtain one under the academic package.





Tim

ChemAxon 9c0afc9aaf

04-04-2008 09:19:12

Quote:
However, I would not recommend using Instant JChem for a database of 12 million compounds. You will need very large amounts of memory for this.
Of course this also applies to all other methods of performing a structure search (jcsearch, cartridge, JSP web application), with the difference that in the case of the cartridge and web applications this memory need does not arise on the client side.


In the future Instant JChem will be able to connect to a central server too.





By the way, 12 million compounds would require about 1.2 GB for the Structure Cache, which might not be totally unrealistic these days.


You have to adjust the allowed memory allocation for Instant JChem if managing huge tables.
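As a rough back-of-the-envelope check, the figure above works out to about 100 bytes per structure (1.2 GB / 12 million). The snippet below is only a sketch that assumes the cache size scales linearly with the number of structures; the 100-byte value is derived from the numbers in this post and is not an official constant, and the function name is made up for this example.

```python
def estimate_cache_bytes(n_structures, bytes_per_structure=100):
    """Estimate structure cache size, assuming linear scaling.

    bytes_per_structure=100 is derived from "12 million compounds need
    about 1.2 GB"; real usage depends on the structures and JChem version.
    """
    return n_structures * bytes_per_structure


pubchem_bytes = estimate_cache_bytes(12_000_000)
print("%.1f GB" % (pubchem_bytes / 1e9))  # prints "1.2 GB"
```

The same function gives a quick estimate for smaller tables, for example estimate_cache_bytes(1_000_000) for a one-million-compound table (about 0.1 GB under the same assumption).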





Please see the FAQ for more information on caching and memory size:


http://www.chemaxon.com/jchem/FAQ.html#cacheSize





Best regards,





Szilard

ChemAxon aa7c50abf8

04-04-2008 10:20:11

Quote:
Thanks for any suggestions you can give
Your choice may be influenced by what interface you would prefer to use.





If you want to execute the searches through scripts, the immediate solution is jcsearch. jcsearch will take about 4 minutes to load 12 million structures each time you start a search with it: this will be your per-search overhead for jcsearch. If you run your jobs through overnight batches or similar, this overhead may or may not be acceptable depending on your requirements.
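To put that overhead in perspective, here is a small sketch of the arithmetic, assuming one jcsearch run per query compound and the roughly 4 minutes of loading time mentioned above. The figure of 3000 queries is only an illustrative stand-in for "a few thousand compounds", and the function name is made up for this example.

```python
def total_overhead_hours(n_queries, load_minutes=4.0):
    """Total start-up overhead in hours if every query is a separate run."""
    return n_queries * load_minutes / 60.0


# 3000 single-compound runs at about 4 minutes of loading each
print(total_overhead_hours(3000))  # prints 200.0
```

Under these assumptions the loading time alone comes to about 200 hours, which is why running one search per compound is only practical for small query sets or long batch windows.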





If you prefer, or can accept, a graphical interface for searches, you can also use the JSP example of JChem Base (online demo installation here: http://www.chemaxon.com/jchem/examples/db_search/index.jsp ). As it is a JChem Base application, it can be configured to use any supported database system.





If you have some experience in Java programming, or are not afraid of gaining some, you can even modify the JSP example to suit your needs: as an (ambitious) example, you could extend it into a web service for batch or script-based use. Note, though, that we have started work on providing a web service interface for JChem.





Peter