Import from Symyx / MDL Database

User 74b72c099e

21-06-2011 11:02:31

I am developing an application which currently uses an MDL / Symyx database to store all the structures. I want to be able to use JChem Base / Cartridge to handle these structures instead so I can perform structure searches etc. throuhg the cartridge.


Is there a simple way of importing the Symyx / MDL Database into tables the JChem Base / Cartridge can handle and work with?

ChemAxon a3d59b832c

21-06-2011 11:16:39

Hi,


 


If you are migrating from MDL / Symyx Direct to JChem Cartridge, then there is no need for re-importing.


Simply put a JChem index on the column where the MDL / Symyx index has been and you are done.


 


Best regards,


Szabolcs

User 74b72c099e

23-06-2011 13:05:33

Thank you Szabolcs


One more question: Is there a default limit on the size of the table you associate this index with? After exactly 21000 structures have been "processed" the index creation stops with a java.lang.Exception:


Error while processing ROWID=AAASYLAAFAAACEWAAD: Inserting a query or Markush structure is not allowed for table: "MOLTABLE_JCHEMIX_JCX"

The table I'm trying to create the index for has more than 320000 structures.


 


 

ChemAxon a3d59b832c

23-06-2011 13:26:37

Hi,


See explanation here on this similar problem:


https://www.chemaxon.com/forum/viewpost33438.html&highlight=inserting+query+markush+structure+allowed+table#33438


 


Basically you will need to use the "any table" index option of CREATE INDEX:


http://www.chemaxon.com/jchem/doc/dev/cartridge/index.html#indexparam_structureType


 


Best regards,


Szabolcs

User 74b72c099e

29-06-2011 08:56:24

Thank you


I managed to create an index on my MOLFILE column in my MOLTABLE table which created a MOLTABLE_JCHEMIX index along with a MOLTABLE_JCHEMIX_JCX table. Is there any way to connect these two tables? My MOLTABLE contains a lot of information about the structures including a self-assigned StructureID etc. and I would like to be able to get e.g. the molweight or molformula directly from the MOLTABLE_JCHEMIX_JCX table using this StructureID. Or am I misunderstanding how this new table is supposed to be used?

ChemAxon aa7c50abf8

29-06-2011 11:12:00

The new table (the index table) is not supposed to be used by users/applications directly. It is supposed to be used through SQL operators and functions defined by JChem Cartridge: http://www.chemaxon.com/jchem/doc/dev/cartridge/cartapi.html . For example, jc_molweight will return the molweight, jc_formula will return the formula. These operators know how to use the index table.

User 74b72c099e

30-06-2011 09:54:31

Ahh, I see - I just wondered what this table was for if I didn't have access to it but when it's provided through the index that's fine.


My new problem is that I get an error when trying to use the jc_molweight function in a trigger to update the molweight value I store in my MOLTABLE:


CREATE OR REPLACE
TRIGGER trg_moltable
BEFORE INSERT OR UPDATE ON moltable
REFERENCING OLD AS OLD NEW AS NEW
FOR EACH row
-- PL/SQL Block
begin
SELECT jc_formula(:new.molfile) INTO :new.MOLFORMULA FROM dual;
SELECT jc_molweight(:new.MOLFILE) INTO :new.MOLWEIGHT FROM dual;
end;
/

The trigger is created without any problems but whenever it is used (i.e. when I add a new structure or change an existing one) I receive the following error: 


ORA-06502: PL/SQL numeric or value error: character to number conversion error
ORA-06512: at "JCHEM.MOLWEIGHT_FUNCC", line 30

Executing


SELECT jc_molweight(:some_molfile) FROM dual

works without any problems.

ChemAxon aa7c50abf8

01-07-2011 11:35:03

The char to number conversion error typically occurs when the NLS settings of the database session don't match the locale/format settings of the JChem Cartridge (JCC) server. For example on Windows, the "Region and Language" settings of the user running the JCC server affect what locale/format the JCC server will apply.


We had a related bug in an earlier JCC version which has been fixed for some time now: https://www.chemaxon.com/forum/ftopic5697.html&start=0&postdays=0&postorder=asc&highlight=swedish . I now tried to reproduce with Swedish locales your problem of calling jc_molweight in a trigger without success. Please, could you provide more details on your environment:


1. The NLS settings used in the Oracle session executing the INSERT/UPDATE:


SELECT 'NLS_DATE_LANGUAGE' AS Parameter, SYS_CONTEXT('USERENV','NLS_DATE_LANGUAGE') AS Value FROM Dual
UNION ALL
SELECT 'NLS_TERRITORY' AS Parameter, SYS_CONTEXT('USERENV','NLS_TERRITORY') AS Value FROM Dual
UNION ALL
SELECT 'Decimal marker' AS Parameter, TO_CHAR(3/7, '0D999999') AS Value FROM DUAL
UNION ALL
SELECT 'Thousands group marker' AS Parameter, TO_CHAR(1000*1000, '999G999G999G999') FROM DUAL;

2. Are the global NLS settings (defined as database initialization parameters) overridden in the session scope (using ALTER SESSION statements or equivalent)?


3. Is the table MOLTABLE stored in the same schema where the trigger is defined?


4. Are the table and the trigger stored/defined in the schema of the user who is executing the INSERT/UPDATE ?


Please, could you try using the jcf.molweight function instead of the jc_molweight operator as a workaround? (In contrast to jc_xxxx operators, jcf functions don't have access to JChem index data, but they are less sensitive to non-trivial configuration problems -- and, as you use literal values in this particular trigger, you don't have [cannot access] here JChem index data anyway.)


User 74b72c099e

04-07-2011 12:55:20

I have already tried with different combinations of region settings on the client and the server hosting the database without any success. To answer your questions:


 


1. The NLS settings used in the Oracle session executing the INSERT/UPDATE:

 


 


"PARAMETER" "VALUE"


"NLS_DATE_LANGUAGE" "DANISH"


"NLS_TERRITORY" "DENMARK"


"Decimal marker" " 0,428571"


"Thousands group marker" "       1.000.000"


 


2. Are the global NLS settings (defined as database initialization parameters) overridden in the session scope (using ALTER SESSION statements or equivalent)?

I don't believe so, no.



3. Is the table MOLTABLE stored in the same schema where the trigger is defined?


Yes.



4. Are the table and the trigger stored/defined in the schema of the user who is executing the INSERT/UPDATE ?

Yes.



 

ChemAxon aa7c50abf8

05-07-2011 09:16:00

1. Did you try using jcf.molweight instead of jc_molweight in the trigger?


 


2. Please, could you temporarily insert the line in the trigger at the beginning of the PL/SQL block (immediately after "begin"):


raise_application_error(-20100, to_char(1000*1000 + 3/7, '999G999G999G999D999999'));

? This is to check whether the same NLS settings are used in the trigger as outside the trigger. (I see no reason why they wouldn't be, but...) Does the resulting error message show dot for the thousand group separator and comma for the decimal separator as expected:


ORA-20100:        1.000.000,428571

? (Please, remove the raise_application_error...line after this test.)


 


3. If the result for action #2 is as expected, could you, please, execute the following SQL command before executing the failing INSERT/UPDATE in the same database session where the INSERT/UPDATE is about to be executed


call jcart_logger.set_log_level('chemaxon', 4);


? With this log level set, the INSERT/UPDATE operation will generate some logs in the Oracle session trace files in the C:\app\Administrator\diag\rdbms\mydb\mydb\trace directory in my environment. Please look for the string calcMolProp: retval= . There should be two log entries containing this string: one for formula and one for weight. Please, could you copy paste here the lines containing this string?

User 74b72c099e

05-07-2011 11:35:13

I did some more testing - it appears that the application actually uses the American number format when I run it (i.e. executing "select SYS_CONTEXT('USERENV','NLS_TERRITORY') from dual;" from within the application returns AMERICA).


Previously I did my testing in SQL Developer (where "DENMARK" was returned in the above query). If I use sqlplus "AMERICA" is returned as well - and both jcf.molweight and jc_molweight fails in the same manner as described before:


SQL> select SYS_CONTEXT('USERENV','NLS_TERRITORY') from dual;

SYS_CONTEXT('USERENV','NLS_TERRITORY')
--------------------------------------------------------------------------------

AMERICA

SQL> select jc_molweight('C=C') from dual;
select jc_molweight('C=C') from dual
*
ERROR at line 1:
ORA-06502: PL/SQL: numeric or value error: character to number conversion error
ORA-06512: at "JCHEM.MOLWEIGHT_FUNC", line 27


SQL> select jcf.molweight('C=C') from dual;
select jcf.molweight('C=C') from dual
*
ERROR at line 1:
ORA-06502: PL/SQL: numeric or value error: character to number conversion error
ORA-06512: at "JCHEM.JCF", line 646



1. Did you try using jcf.molweight instead of jc_molweight in the trigger?

Yeah, jcf.molweight was no different.


2. Please, could you temporarily insert the line in the trigger at the beginning of the PL/SQL block (immediately after "begin"):

ORA-20100: 1,000,000.428571


(So American number format)


 


3. If the result for action #2 is as expected, could you, please, execute the following SQL command before executing the failing INSERT/UPDATE in the same database session where the INSERT/UPDATE is about to be executed

call jcart_logger.set_log_level('chemaxon', 4);

 


DEBUG - chemaxon.jchem.cartridge.JFunctions: calcMolProp: retval=162,2713


(So Danish number format?)

ChemAxon aa7c50abf8

06-07-2011 08:52:47

Do you need to support differrent locales on a session-by-session bases? If it is OK to use the same locale for all database sessions in the same JChem Cartridge instance, the simplest is to make sure the database and JChem Cartridge server host uses the same Region and Language settings.

User 74b72c099e

11-07-2011 08:06:00

All database sessions are run in AMERICAN sessions but even if I set the server's region to English (United States) JChem still complains when I run AMERICAN sessions while DANISH sessions run just fine. How do I change the region setting of JChem? And how come it doesn't just comply with the current session's region?

ChemAxon aa7c50abf8

11-07-2011 10:59:30

All database sessions are run in AMERICAN sessions but even if I set the server's region to English (United States) JChem still complains when I run AMERICAN sessions while DANISH sessions run just fine.

Did you change the server settings at the right place...?


How do I change the region setting of JChem?

To change the JChem Cartridge server locales to Danish/Denmark, you have to add the following lines to the jchem/cartridge/jcart.properties file:


sysprop.user.language=da
sysprop.user.region=DK

Please, see http://www.oracle.com/technetwork/java/javase/locales-137662.html for more locales.


And how come it doesn't just comply with the current session's region?

This is a limitation of the current JChem Cartridge implementation. It should be possible to remove this limitation, though. Do you need to support multiple locales with the same JChem Cartridge instance?

User 74b72c099e

19-07-2011 11:22:21

Changing the JChem Cartridge server locales solves the issue for now.


Thank you very much for your help!

User 74b72c099e

28-09-2011 11:00:28

I have a problem with extremely slow searches. I have created the index as described above but the search speeds are not satisfactory. For instance, a specific substructure search which returns 45 structures takes about 1 second using Symyx/MDL while it takes more than 95 seconds to perform the same search using JChem Cartridge. I am running the JChem Cartridge on a virtual server which might slow down the process a bit but I still find a 95 time increase in search time to be significant. Any suggestions on what I could be doing wrong?

ChemAxon aa7c50abf8

28-09-2011 11:05:30

Please, could you provide



  1. the SQL query,

  2. the size of the structure table,

  3. the number of (expected) hits ?

User 74b72c099e

28-09-2011 12:21:39

1. SQL query:


SELECT *
FROM moltable
WHERE
jc_compare(molfile, :structure, 't:s') = 1;

where the :structure-binding is


 
Mrv0541 09281113412D

12 13 0 0 0 0 999 V2000
2.4429 -4.0932 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0
2.4417 -4.9201 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0
3.1561 -5.3326 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0
3.1543 -3.6807 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0
3.8692 -4.0896 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0
3.8700 -4.9159 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0
4.5848 -5.3266 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0
5.2993 -4.9121 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0
5.2947 -4.0826 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0
4.5792 -3.6757 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0
1.7239 -5.3258 0.0000 O 0 0 0 0 0 0 0 0 0 0 0 0
6.0087 -5.3216 0.0000 O 0 0 0 0 0 0 0 0 0 0 0 0
5 6 1 0 0 0 0
3 6 2 0 0 0 0
6 7 1 0 0 0 0
1 2 2 0 0 0 0
7 8 2 0 0 0 0
5 4 2 0 0 0 0
8 9 1 0 0 0 0
4 1 1 0 0 0 0
9 10 2 0 0 0 0
10 5 1 0 0 0 0
2 11 1 0 0 0 0
2 3 1 0 0 0 0
8 12 1 0 0 0 0
M END

2. Size of structure table: 322,534


3. Number of hits: 47


Time: 91.832 seconds

ChemAxon aa7c50abf8

28-09-2011 12:36:33

How long does it take this one:



SELECT *
FROM moltable
WHERE
jc_compare(molfile, :structure, 't:na') = 1;


?

User 74b72c099e

28-09-2011 12:54:27

That search completes in less than a second - but returns 1324 structures.

ChemAxon aa7c50abf8

28-09-2011 12:59:43

 


Please, could you do the following:


  1. Add the following lines to the
    jchem/cartridge/conf/logging.properties file of the JCC server

    chemaxon.jchem.db.JChemSearch.stats.level = FINEST

    chemaxon.jchem.cartridge.level = FINER

  2. Restart the JCC server

  3. Execute the problematic query twice

  4. Send me the log files in jchem/cartridge/logs/ to pkovacs at chemaxon dot com


 


?

User 74b72c099e

28-09-2011 13:23:15

Log files sent :) Your help is appreciated.

ChemAxon aa7c50abf8

29-09-2011 16:55:51

Thank you!


We will investigate the problem (JC-3072).


Peter

User 74b72c099e

04-10-2011 12:44:34

Any ideas where I could start looking for a solution to (/the cause of) this problem?

ChemAxon a9ded07333

05-10-2011 09:11:32

Hi Mikkel,


I could not reproduce the slowdown on a similar size (300.000 structures) NCI database, the search took only 1.5 seconds.
Could you send the structures in your database - in case they are not confidential - to tcsizmazia at chemaxon dot com?


Regards,
Tamás

User 74b72c099e

06-10-2011 08:27:38

That's not really possible. The database contains a lot of confidential structures that I am not at liberty to hand out.

ChemAxon a9ded07333

17-10-2011 07:19:50

Hi Mikkel,


For further investigation could you please provide the following information:



select jchem_core_pkg.get_idx_stats(<idx_schema>, <idx_name>, null) from dual;

Are the jchem server and cartridge installed on the same machine or they communicate over network?


Please add the following line to the
jchem/cartridge/conf/logging.properties file of the JCC server


chemaxon.jchem.db.JChemSearch.stats.level = FINEST


then measure the time of the following queries, running them 2-3 times in random order and send the log files to me (see email address in my previous post)



 


Regards,
Tamás

ChemAxon aa7c50abf8

20-01-2012 19:07:46




How do I change the region setting of JChem? And how come it doesn't just comply with the current session's region?

JChem 5.8.0 has been released with all known instances of reliance on the JChem Server's locale (region settings) removed.


Peter