Oracle trace file is being produced when SMILES is invalid

User 9f6f294e9f

20-05-2010 07:41:39

Hi


One of our DBA's noticed that we have a significant number of Oracle trace files being produced as a result of what appears to be incorrect SMILES strings. Here's part of a trace file:


*** 2010-05-19 11:58:01.224
*** ACTION NAME:() 2010-05-19 11:58:01.223
*** MODULE NAME:() 2010-05-19 11:58:01.223
*** SERVICE NAME:(SYS$USERS) 2010-05-19 11:58:01.223
*** CLIENT ID:() 2010-05-19 11:58:01.223
*** SESSION ID:(805.2712) 2010-05-19 11:58:01.223
2010-05-19 11:58:01.219 [14578533, REG_POC, REG_POC]: ERROR - chemaxon.jchem.cartridge.oresident.nonidxscan.NisStructureSearch: java.rmi.ServerException: RemoteException occurred in server thread; nested exception is:
 java.rmi.RemoteException: Unmatched closing branch bracket in SMILES string at character 83 ON1C(=O)CC(C1=O)S([O-])(=O)=O3COC(=O)c4c(C)c(O)cc(O)c4CSCC(NC3=O)C(=O)N[C@H](C)C=O)[C@@]1([H])NC(=O)N2 |r|
 at sun.rmi.server.UnicastServerRef.dispatch(Unknown Source)
 at sun.rmi.transport.Transport$1.run(Unknown Source)
 at java.security.AccessController.doPrivileged(Native Method)
 at sun.rmi.transport.Transport.serviceCall(Unknown Source)
 at sun.rmi.transport.tcp.TCPTransport.handleMessages(Unknown Source)
 at sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.run0(Unknown Source)
 at sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.run(Unknown Source)
 at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(Unknown Source)
 at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
 at java.lang.Thread.run(Unknown Source)
 at sun.rmi.transport.StreamRemoteCall.exceptionReceivedFromServer(StreamRemoteCall.java)
 at sun.rmi.transport.StreamRemoteCall.executeCall(StreamRemoteCall.java)
 at sun.rmi.server.UnicastRef.invoke(UnicastRef.java:133)
 at chemaxon.jchem.cartridge.rmi.impl.MolSearchImpl_Stub.search(MolSearchImpl_Stub.java:53)
 at chemaxon.jchem.cartridge.oresident.nonidxscan.NisStructureSearch.execute(NisStructureSearch.java:113)
 at chemaxon.jchem.cartridge.JFunctions.execFunction(JFunctions.java:458)
 at chemaxon.jchem.cartridge.JCFunctionsClob.execFunction(JCFunctionsClob.java:68)


Please will you take a look at the code in question and consider whether this sort of exception can be handed gracefully. The cartidge version is 5.3.2.


Regards



Ant

ChemAxon aa7c50abf8

20-05-2010 08:14:11

Hi Ant,


These appear to be user errors (invalid input) which are supposed to be re-thrown to the application which has called the JChem Cartridge function/operator generating the exception. (It is a bug, if they are not re-thrown to the application.) They are also logged for easier tracking, should any uncertainties arise about the errors' nature or cause/source "after the fact". (The same error message is supposed to be found in JChem Server's logs with a different Java stack trace.)


What improvement do you have in mind when talking about graceful hand[l]ing?


Thanks


Peter

User 9f6f294e9f

20-05-2010 08:39:29

Hi Peter


Yes, I've noticed these entries in the jchem log file previously. By 'handled gracefully' I mean that the application (in this case the cartridge) should catch the invalid syntax and then raise an error appropriate to the environment with a meaningful message. For an Oracle database this would probably be carried out by calling raise_application_error - here's a very simple example :


SQL> begin
  2   raise_application_error(-20001,'Invalid query syntax, smiles entered is invalid');
  3  end;
  4  /
begin
*
ERROR at line 1:
ORA-20001: Invalid query syntax, smiles entered is invalid
ORA-06512: at line 2


I see no valid reason for raising an oracle trace file - database trace files should be produced only when the application (in this case the cartridge) encounters something unexpected that can't be handled by its code. Unfortunately we expect our users to make mistakes with their SMILES specifications on occasion, so this should be anticipated and catered for in the code. You said in your earlier reply that "It is a bug, if they are not re-thrown to the application". A meaningful error message shoud be raised to the application, but I suggest that just to re-raise an untidy java exception is actually a bug in itself !


I'm ambivalent as to whether it's worth logging an invalid smiles string in the Jchem log file. if you think that there's some value in leaving an entry in the log file then carry on logging it by all means. For this sort of error. though, I don't see that there is any value in including the java stack trace in the Jchem log - what do you think ?


Regards


Ant

ChemAxon aa7c50abf8

21-05-2010 17:52:32

Hi Ant,


There appears to be at least three distinct aspects to your suggestion:



  1. What is contained in the error message

  2. How the error is raised (let Oracle raise an ORA-29532 error or replace that error with my own PL/SQL exception)

  3. What is traced/logged


What is contained in the error message


Whether a message is meaningful or not is sometimes subjective and/or dependent on the exact context the error is occurring in. One could argue that the error message you presented in your initial post is meaningful in general, even if it contains some "noise" not directly useful to the end user. (Extra information which can be considered "noise" in most, but not necessarily all scenarios.)


I agree that error messages should contain only the necessary pieces of information. But what is necessary may depend on the user of the error message. For example, should the error message about a structure format error contain the structure itself? Not necessarily, because the error message can tell, if the error is about the query or the target. But what about this scenario:


select count(*) from nci_1k where jc_contains(structure, 'Linux rulez!') = 1 or jc_contains(structure, 'Zazi dans le metro') = 1
*
ERROR at line 1:
ORA-29902: error in executing ODCIIndexStart() routine
ORA-20001: Invalid query syntax, SMARTS entered is invalid
ORA-06512: at "PKOVACS_TRUNK.JC_IDXTYPE_IM", line 207

Which query structure is meant here? The query for the first JCC operator or for the second?



Shouldn't we also include the reason why the structure is incorrectly formed? With any structure of some complexity, the user is likely to be grateful having an indication of the cause so she instantly knows how to fix the problem. The error you reported in your initial post is a case in point.



Wouldn't there be some more complicated cases, when it would make everybody's life easier if the message included some indication of exactly where in the code the error occurred?



How the error is raised?



If a Java stored procedure generates an exception, the exception will be logged in the Oracle session trace no matter how the exception is handled in the calling PL/SQL code. In order to provide the ability for users to suppress (or enable) logs in the Oracle session trace, Java stored procedures (JSPs) should not throw exceptions.



The only way that appears to be left for JSPs to pass error conditions to their PL/SQL callers appears to


  1. store the error message in the session scope (e.g. a package variable such as jchem_core_pkg.last_error) and

  2. indicate through return values that an error has occurred.


If a JSP returns an error, the caller would raise a PL/SQL exception using the error message stored in the session's scope. This way the user will be able to turn off or on at her discretion the logging of JSP errors to the Oracle session trace file. (When turned on, the full exception will be logged as opposed to the shortened message included in the exception raised on the PL/SQL side. PL/SQL appears to trim error messages to 4000 characters, which is often not enough, if, for example, badly formed MRVs or SDFiles are to be made available somehow. For the same reason, some of the current error messages must be rearranged so that the reason comes first, then the structure second [if any], then the failing SQL third [if any].)



There is a potential problem with session-scoped variables: during execution of SQL queries containing calls to multiple JCC operators, the operators may be executed in an interleaved fashion (typically as part of two domain index scans) and if two or more operators fail on the Java side, one PL/SQL caller may trigger the abortion of the top level SQL call with the error message of the other failed call. While the error message will still be valid from the top-level perspective, it may be inconsistent with the PL/SQL stack presented in the final error message. In other words: the error message will appear in the wrong "(sub-)context". The implementation of the "error message passing" has to take care of this problem (by somehow associating the error messages with the corresponding index scan context where applicable). -- Or a more comfortable approach: we simply say that errors occurring while the index scan is already underway are unrecoverable and let JSP exceptions go through.



What should NOT be traced / logged?


  1. All errors due to invalid arguments input by the user. Error types directly attributable to invalid user input:

    1. Invalid structure format

    2. Invalid Chemical Terms expression

    3. Invalid search option

    4. Invalid option for JChem table DMLs


  2. "Fake errors" providing feedback on a condition to the user short of a better mechanism:

    1. Insertion of duplicate structures (when duplicate filtering is turned on)

    2. Indexing duplicate structures (when duplicate filtering is turned on)



 Regards,


Peter

ChemAxon aa7c50abf8

23-05-2010 22:56:48

I have edited my previous posting partly to "stream-line" its content, partly to add more thoughts to it.


It starts to resemble a rough specification which may not exactly be the "genre" for this forum. I intentionally left it so (after long deliberation), because I have the impression the issue under this topic is a bit like the problem of "what color the help button should be?". I don't really feel to have a good grasp on the real virtues of any of the possible answers. (Especially, that each answer I could come up with so far appear to be onerously expensive.)


Peter

User 9f6f294e9f

01-06-2010 08:34:49

Hi Peter


Thanks for your considered reply to what is clearly a difficult area. I'll now make some suggestions that you may (or may not !) find useful.


I suggest that we should try to distinguish between errors that we can reasonably anticipate as arising regularly and other error conditions. An example that falls in to the 'other' category may be a network error, and for those others I think it reasonable to raise a java exception. I think that you have alreadu identified the main user errors that we can anticipate in your previous post on this topic:


Invalid structure format
Invalid Chemical Terms expression
Invalid search option
Invalid option for JChem table DMLs


In terms of what we report back to the user I suggest that we look for a pragmatic solution. If there are mutiple input smiles then an indication that one of the smiles is invalid is good enough (and a lot better than a java exception). If it is possible to include the reason for the invalidity so much the better, but this shouldn't be seen as essential. There is no point in reproducing the whole of a lengthy SDFile in an error message. I would suggest that errors be limited to a sensible maximum - 200 characters perhaps ?


How you handle this via pl/sql isn't simple. Your suggestion of a package variable is a good one. Any pl/sql that is to include a check on the variable post a call to the java code would need to ensure that the variable was initialised at the start of the pl/sql block in question. I would say that it's reasonable to avoid difficult issues with interleaving by allowing exceptions to go though once index scans are underway.


I appreciate that the above proposals would involve some extensive code changes, but feel that the product would be better as a result. However, this is my opinion - you will no doubt wish to gauge other views before you decide whether or not to take any of of these suggestions forward.


Regards


Ant

ChemAxon aa7c50abf8

01-06-2010 14:35:05

Hi Ant,


Thank you for your thoughts. I think I can make some noticeable improvements along these lines.


An implementation detail: I decided to rework the internal interfaces on the JSP / PL/SQL boundary so that the error conditions are always unambiguously communicated through return values. And another: for structure format exceptions I will initially rely on the messages generated by the underlying format-handling core components, as they provide a reasonable amount of details most of the time.


Errors which are not logged in the Oracle session trace will not be logged on the JChem Server side either.


I expect the most often occurring error messages (about invalid structure format, invalid search option and exceptions indicating duplicate structures) with the most often used functionality (structure and similarity search, inserts/updates and indexing) to be covered by these improvements in the next minor JChem version. We can cover more error types and/or more functions as the need for them arises later.


Regards,


Peter

User 9f6f294e9f

01-06-2010 15:01:00

That all sounds great Peter. I do appreciate the effort involved - but think that the improvements will be well received and are worth the pain of tackling that hornets nest !


Regards


Ant

ChemAxon aa7c50abf8

19-07-2010 17:15:27

Hi Ant,


JChem 5.3.6 has been released with the improvement described earlier. Let me know if the result (of the improvement) meets your expectations and/or if you find that a similar solution should be extended to other type of errors.


Best regards,


Peter

User 9f6f294e9f

30-07-2010 14:00:09

I can confirm that 5.3.6 handles invalid smiles strings more cleanly, and now doesn't raise an uncaught java exception. Thankyou for this enhancement Peter.