Replacing jc_evaluate in WHERE clause with jc_compare

User 104a68add6

05-07-2005 19:17:29

Hi,





We are trying to use the cartridge for a bioavailability search....not really needing any comparisons. Your docs provide a snippet of bioavailability type functions, but I could use some exact syntax help!





Here is an example of what we'd like to do. I understand that we can't use this syntax with jc_evaluate in the WHERE...just showing you our goal query:





select


ct_number, jc_molweight(CD_SMILES) mw, jc_evaluate(CD_SMILES,'fusedAromaticRingCount') FArRg_N


from


structure


where


ct_number between 340000 and 340055


and


jc_evaluate(CD_SMILES,'mass') <= 500 and


jc_evaluate(CD_SMILES,'logP') <= 5;





The examples you provide of using jc_compare with the ctFilter come quite close to what we want, and perhaps I'd use the filterQuery option for the BETWEEN part of the WHERE clause:





SELECT count(*) FROM nci_10m WHERE jc_compare(structure, 'c1ccccc1', 'sep=! t:s!ctFilter:logp()>9') = 1;








SELECT count(*) FROM nci_3m WHERE jc_compare(structure, 'O=C1ONC(N1c2ccccc2)-c3ccccc3','sep=! t:s!


ctFilter:(mass() <= 500) && (logP() <= 5) && (donorCount() <= 5) && (acceptorCount() <= 10)') = 1;








In the first example, not only do you look at the logP, you make the structure contain an aromatic ring. I want to skip the aromatic ring part of the query, and just do the other evaluations. Do I make it a null string, or put a 'C' in it (since all the structures will have a Carbon)? I'd rather not do that part of the compare at all...is there a wildcard or an 'Any' type of clause?





Thanks in advance,


Julie

ChemAxon aa7c50abf8

06-07-2005 08:58:19

Julie,





You can use jc_evaluate in the WHERE clause. The point in the docs is to dissuade users as much as possible from structuring their conditions in an "intuitive" way and to stress the performance impact of the selected formulation.





In order to determine the best formulation in terms of performance, I suggest to experiment with the various options available.





The following statement could be a good start:





Code:
select


ct_number, jc_molweight(CD_SMILES) mw, jc_evaluate(CD_SMILES,'fusedAromaticRingCount') FArRg_N


from structure


where ct_number between 340000 and 340055 and jc_evaluate(cd_smiles, 'mass()<= 500 && logp() <=5') = 1;








I believe that a combined mass and logp chemical term will yield the best performance in most cases (as opposed to specifying them with separate jc_evaluate operators).





If the condition on ct_number is highly selective, you can leave it as a "stand-alone condition". If its selectivity is low (or you are not sure), it may be useful to experiment with jc_compare specifying '*' as the query structure and including both the chemical terms filter and the ct_number condition as a filter in the jc_compare options.





Peter

User 104a68add6

06-07-2005 15:22:54

Thanks Peter,





The && does not seem to work, as this is being executed in SQL*Plus, and I get a prompt (SQL*Plus thinks it is an argument to be input). If I replace that with "and", or just take off the logP to do a test, I get the following error message:





select


ct_number, jc_molweight(CD_SMILES) mw, jc_evaluate(CD_SMILES,'fusedAromaticRingCount') FArRg_N


from structure


where ct_number between 340000 and 340055 and jc_evaluate(cd_smiles, 'mass()<= 500') = 1;


select


*


ERROR at line 1:


ORA-29902: error in executing ODCIIndexStart() routine


ORA-29532: Java call terminated by uncaught Java exception:


java.lang.Exception: The following exception has been thrown by the servlet:


Exception: java.io.IOException: file format not recognized


ORA-06512: at "JCHEM.JCHEM_CORE_PKG", line 0


ORA-06512: at "JCHEM.JC_IDXTYPE_IM", line 243


ORA-06512: at line 1





This was what I got originally, when I discovered the documentation note about jc_evaluate in the WHERE clause. And that is why I went to jc_compare. What am I missing here?





Regards,


Julie

ChemAxon aa7c50abf8

06-07-2005 15:41:45

Julie,





This is an SQL Plus specific issue. Check the description of the SET DEFINE ... command in "SQL*PlusĀ® User's Guide and Reference". (You can for example disable substitution variable parsing by executing SET DEFINE OFF in the SQL Plus command line. You can also change the substitution variable marker from '&' to something else.)





Peter

User 104a68add6

06-07-2005 16:10:51

Hi Peter,





I think you missed the next part of my message. Even with out any & & or and...just a single jc_evaluate (using mass) in the WHERE I get that error:





SQL> select


ct_number, jc_molweight(CD_SMILES) mw, jc_evaluate(CD_SMILES,'fusedAromaticRingCount') FArRg_N


from structure


where ct_number between 340000 and 340055 and jc_evaluate(cd_smiles, 'mass()<= 500') = 1;


select


*


ERROR at line 1:


ORA-29902: error in executing ODCIIndexStart() routine


ORA-29532: Java call terminated by uncaught Java exception:


java.lang.Exception: The following exception has been thrown by the servlet:


Exception: java.io.IOException: file format not recognized


ORA-06512: at "JCHEM.JCHEM_CORE_PKG", line 0


ORA-06512: at "JCHEM.JC_IDXTYPE_IM", line 243


ORA-06512: at line 1





I understand about using && - I can esape it or redefine it as you suggest...but I need to get past the above message first. I am pretty sure the above query is correctly written. I took your query and just removed the logP portion, making it even simpler. Did that make sense?





Thanks,


Julie

ChemAxon aa7c50abf8

06-07-2005 16:23:15

I realized now that I have not implemented support for empty and null smiles along this path of execution. Do you have empty and/or null smiles in your table? That could explain the error message.

User 104a68add6

06-07-2005 16:29:17

Yes, there are nulls in the SMILES column for some of the entries. Perhaps that is why it is hanging when I do SET DEFINE OFF and try to combine a query with mass and logP (just tried that a while ago, and it has hung 3 times). The single, simple queries just give the error message I sent you earlier.





Julie

ChemAxon aa7c50abf8

06-07-2005 16:39:06

OK! I will fix it for the next release.


Peter

User 104a68add6

06-07-2005 16:45:59

Great! Thanks a lot for the work, Peter....





Could you give me a jc_compare workaround in the meantime...referring back to my original message?





Regards,


Julie

ChemAxon aa7c50abf8

07-07-2005 10:16:46

Julie,





I suggest the following SQL as workaround:





Code:
select ct_number, cd_molweight mw, jc_evaluate(cd_smiles, 'fusedAromaticRingCount') FArRg_N from structure where jc_compare(cd_smiles, '*', 'sep=! t:s!ctFilter:mass()<= 500 && logp() <=5!filterQuery:select cd_id from structure where ct_number between 340000 and 340055') = 1;






Note that molweight is precomputed and stored in the cd_molweight column of JChem-tables. There is no need for calling "jc_molweight(CD_SMILES)".





Peter





PS:


This "workaround" statement may even be faster (due to the use of filterQuery with jc_compare) than if you used jc_evaluate(). This depends on the actual selectivity of the conditions.

User 104a68add6

07-07-2005 14:22:40

Thanks Peter,





I had started experimenting with the '*' but wasn't certain if I was heading the right way. I appreciate all the help!


Julie