jcf.molconvert extra line spaces

User 773d472e7f

28-01-2014 11:59:12

Dear ChemAxon


I was previously informed of a a bug in jc_molconvert:


https://www.chemaxon.com/forum/ftopic10802.html&highlight=


I have now upgraded to JCHEM 6.1


It seems now I need to use jcf.molconvert (my previous case was with jc_molconvert)


jcf.molconvert produces extra line breaks within the molfile with this type of usage:


select jcf.molconvert(CTAB_CLOB,'mol') into v_ctab from akos_moltable where AKOS_NUMBER = cur_rec.AKOS_NUMBER; 


 


environment: Windows Server 2012 64 bit | Oracle 11.2 | JChem Cartridge 6.1


Please advise if this is a bug and if so when a fix might be due.


Many thanks


Bernard

ChemAxon abe887c64e

29-01-2014 08:58:54

Hi Bernard,


I tried to reproduce this error, but  found that the akos_sample.sdf file (attached to the cited forum question) isn't a valid sdf file because it has "Invalid header block". So, I think the superfluous lines come from the original structure file.


Please, check it.


Best regards,


Krisztina


 

User 773d472e7f

29-01-2014 15:06:25

Dear Krisztina,


Thanks for your reply.


The example sdf file from the cited forum topic is the RESULT of running this statement:


select jc_molconvert(CTAB_CLOB,'mol') into molfile from akos_moltable where AKOS_NUMBER = cur_rec.AKOS_NUMBER;


This new forum question is reporting the effect from the related function jcf.molconvert.


select jcf.molconvert(CTAB_CLOB,'mol') into molfile from akos_moltable where AKOS_NUMBER = cur_rec.AKOS_NUMBER;


To be clear : the molconvert is operating on data from oracle (CTAB_CLOB) and producing a molfile. 


The resulting molfile that is produced by the molconvert function contains the extra line breaks.


The original molfile did not contain these odd line breaks.


 


Kind regards


 


Bernard

ChemAxon abe887c64e

29-01-2014 17:31:34

Hi Bernard,


Thank you for the explanation. Would you select a structure from the table without conversion? We would like to ensure that really the jc_molconvert function who inserts the extra line breaks.


Krisztina

User 773d472e7f

30-01-2014 08:06:43

Dear Krisztina,


I have run the procedures as requested:


once simply extracting the CTAB_CLOB:


select ctab_clob into v_ctab from akos_moltable where AKOS_NUMBER = cur_rec.AKOS_NUMBER;


once requesting the molconvert function:


select jcf.molconvert(CTAB_CLOB,'mol') into v_ctab from akos_moltable where AKOS_NUMBER = cur_rec.AKOS_NUMBER;


The full PLSQL procedure is enclosed as well as the resulting text file output from these procedures.


Kind regards


 


Bernard

ChemAxon d9cc14700b

30-01-2014 14:49:52

Hi Bernard,


 


Thanks for the additional information. Unfortunately, I am still unable to reproduce the issue. Can you clarify the following things:


1. You mentioned that the JChem Cartridge runs on Windows Server 2012 64 bit. Does that mean that the JChem server also runs on that machine?


2. In which format does the ctab_clob column contain molecules (mol, sdf)?


3. In the output file create without conversion, I see that they contain CR/LF line terminators. Can you confirm that this is the original line terminator?


4. And last, is there any Linux/UNIX machine invloved in your workflow, which could somehow mix up the line terminators (with the additional linux/unix specific CR)?


Thanks and Regards,
Gabor

User 773d472e7f

18-02-2014 09:35:00

Hi Gabor,


Apologies for this delayed reply on this, other priorities got in the way. However this remains an issue for me for the production of SDF export files.


This command is run only on windows exactly with the code that was attached to my previous post , the code that produces the SDF text file is reproduced in full. It simply uses the oracle UTIL_FILE procedures to write out the data to the operating system file. The relevent command is:


UTL_FILE.PUT_LINE(v_file,


                        v_ctab ||


                        '>  <PUBCHEM_EXT_DATASOURCE_REGID>' || CHR(10) ||


                        cur_rec.AKOS_NUMBER || CHR(10) || CHR(10) ||


                        '>  <PUBCHEM_EXT_DATASOURCE_URL>' || CHR(10) ||


                        'http://www.akosgmbh.de/AKosSamples/index.html' || CHR(10) || CHR(10) ||


                        '>  <PUBCHEM_EXT_SUBSTANCE_URL>' || CHR(10) ||


                        'http://akosservice.de/akossamplesservice/php/akossamplesretrieval.php?IDNUMBERS=' || cur_rec.AKOS_NUMBER || CHR(10) || CHR(10) ||


                        '>  <PUBCHEM_DEPOSITOR_RECORD_DATE>' || CHR(10) ||


                        v_date || CHR(10) ||


                        '$$$$'


                        );


the variable v_ctab is loaded with either


(a) a direct select on the CLOB column, in which case it produces a normally formed molfile output.


	select ctab_clob into v_ctab 
from akos_moltable where AKOS_NUMBER = cur_rec.AKOS_NUMBER;

(b) a call to the molconvert function, in which case extra line feed appear in the output.


	select jcf.molconvert(CTAB_CLOB,'mol') into v_ctab 
from akos_moltable where AKOS_NUMBER = cur_rec.AKOS_NUMBER;


 


This should be easy to reproduce:


 



  1. Create a simple regular (not JChem) molecule table with a CLOB column and possibly an ID column.

  2. Insert some molefile data into the CLOB Column

  3. create an index on the clob column with : CREATE INDEX AKOS_JC_MOLIX ON AKOS_MOLTABLE(ctab_clob) INDEXTYPE IS jchem.jc_idxtype PARAMETERS('haltOnError=nf' );

  4. run a version of the plsql script against the clob column to output the data into text file via the UTIL_FILE procedures. 

  5. Compare the results with the usage of method (a) and (b) above.


kind regards


 


Bernard


 


 


 


              


 



ChemAxon 61b4fee994

19-02-2014 13:14:32

Hi,


We could reproduce your case. jcf.molconvert puts new line characters to the string as cr+lf (chr(13) and chr(10), because Oracle is running in a Windows environment. However, Oracle databases have a strange implementation of UTL_FILE.PUT, because it treats this end of line character combination as two end of lines. So it will replace it as cr+lf cr+lf. That's why you see the double new line characters. I couldn't find a way to change this behavior of Oracle, and we don't want to change the implementation of JChem functions (as we feel they do the right thing), we can only suggest workarounds.


The easiest is to remove the cr characters from the string after conversions:


replace(jcf.molconvert(CTAB_CLOB,'mol'),chr(13),null)


Another method to do this is to rely on put_raw instead of put and put_line:


- use UTL_FILE.FOPEN with wb parameter instead of w


- use UTL_FILE.PUT_RAW(v_file,UTL_RAW.CAST_TO_RAW(<your_molecule and strings>));


Best regards,


Tamas

User 773d472e7f

19-02-2014 16:20:03

Hi Tamas,


OK, Thank you for your advice, I will work around this one.


best regards


Bernard.