compressed MRV

User 870ab5b546

04-09-2006 01:14:17

Do you have a compressed form of the MRV format like you do compressed MOL?

ChemAxon 7c2d26e5cf

04-09-2006 10:37:23

No, there is currently no compressed version of MRV (as there is for MOL).

Would you like one?

User 870ab5b546

04-09-2006 11:11:23

Yes, please.

ChemAxon 7c2d26e5cf

04-09-2006 12:57:58

We have added your suggestion to our feature request list.

ChemAxon a3d59b832c

05-09-2006 07:51:54

AFAIK, MolImporter supports gzipped files. (You can also open these from msketch and mview.)

Is this suitable for your needs?

User 870ab5b546

05-09-2006 10:08:54

It may. Rafi had a similar idea, though I don't think he knew MolImporter handled them. Do you know if the gzipped string might have embedded ' characters, and, if so, would there be a problem handing it to Oracle?

ChemAxon 9c0afc9aaf

05-09-2006 11:29:24

Quote:
Do you know if the gzipped string might have embedded ' characters
The gzipped data can contain any kind of bytes, even ones outside the "normal" character set, so it is not suitable for representation as a string (neither in SQL nor inside a web page).

May I ask what your most important reason for compression is?

Best regards,

Szilard

ChemAxon 7c2d26e5cf

05-09-2006 11:46:39

Quote:
The gzipped data can contain any kind of bytes, even outside the "normal" character set,
Because of this, use base64 encoding for gzipped files/streams:


http://www.chemaxon.com/marvin/doc/user/base64-doc.html


Please see the following example, where the molecule is transferred to a JSP in base64-encoded gzipped format.


http://www.chemaxon.com/marvin/doc/dev/example-noliveconnect1.html
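
For illustration, the gzip-then-base64 idea can be sketched with the standard Java library alone. This is only a minimal sketch, not the code behind the example above: the class name MrvCodec and its method names are hypothetical, UTF-8 is assumed as the text encoding, and java.util.Base64 (Java 8 and later) is used for the encoding step.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.Base64;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

// Hypothetical helper: gzip MRV text and wrap the bytes in base64.
public class MrvCodec {

    /** Gzips the MRV text, then base64-encodes the bytes into plain ASCII. */
    public static String compress(String mrv) throws IOException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        // Closing the gzip stream writes the trailer, so read the buffer afterwards.
        try (GZIPOutputStream gzip = new GZIPOutputStream(buffer)) {
            gzip.write(mrv.getBytes(StandardCharsets.UTF_8));
        }
        return Base64.getEncoder().encodeToString(buffer.toByteArray());
    }

    /** Reverses compress(): base64-decodes, then gunzips back to text. */
    public static String decompress(String encoded) throws IOException {
        byte[] compressed = Base64.getDecoder().decode(encoded);
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (GZIPInputStream gzip =
                 new GZIPInputStream(new ByteArrayInputStream(compressed))) {
            byte[] chunk = new byte[4096];
            int n;
            while ((n = gzip.read(chunk)) != -1) {
                out.write(chunk, 0, n);
            }
        }
        return new String(out.toByteArray(), StandardCharsets.UTF_8);
    }
}
```

The standard base64 alphabet consists only of A-Z, a-z, 0-9, +, / and =, so the encoded string contains no ' characters. It is, however, about one third longer than the gzipped bytes.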

User 870ab5b546

05-09-2006 12:21:28

We need to store MRV responses in an Oracle database, and apparently we do it by passing a string to Oracle via SQL. According to Rafi:
Quote:
The problem is a limit of 4K characters in a string literal in an oracle SQL statement. Even though the relevant field is of type LONG (which means arbitrarily long string), the literal can't be too long. One workaround is to compress and then feather the MRV, using java.util.zip and then sun.misc.BASE64Encoder.encode.
When we draw a mechanism, with rectangles, graphical arrows, and electron-flow arrows, the MRV file very rapidly becomes very long. We recently tried to store a file that was 4.8K long.

ChemAxon 9c0afc9aaf

05-09-2006 12:49:09

Dear Bob,

Compression would be an unsafe solution in this case, because in extreme cases you may exceed the 4000-character limit even in compressed form.

There is no need to send the structure to the database using a string literal, and there is no practical limit to the size of data in JDBC.

I suggest you study a JDBC tutorial and the Oracle reference.





Some tips:

- BLOB or CLOB columns are the best for storing data of arbitrary size

- there are several ways to write LOB columns, but not all of them work with large data sizes; you should follow Oracle's recommendations.
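
As a rough sketch of the first point, the MRV text can be bound as a parameter of a PreparedStatement instead of being pasted into the SQL as a literal. The table name responses and the columns id and mrv (assumed to be a CLOB) are hypothetical, as is the class name MrvStore. Only standard java.sql calls are used; whether this is the best way to write large values with a particular Oracle driver version should be checked against Oracle's documentation.

```java
import java.io.StringReader;
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;

// Hypothetical helper: store MRV text in a CLOB column through a bind parameter.
public class MrvStore {

    // Hypothetical table and column names; the MRV never appears in the SQL text.
    static final String INSERT_SQL =
        "INSERT INTO responses (id, mrv) VALUES (?, ?)";

    /** Inserts one row and returns the number of rows affected. */
    public static int insertMrv(Connection conn, long id, String mrv)
            throws SQLException {
        try (PreparedStatement ps = conn.prepareStatement(INSERT_SQL)) {
            ps.setLong(1, id);
            // Stream the text to the driver; the length is given in characters.
            ps.setCharacterStream(2, new StringReader(mrv), mrv.length());
            return ps.executeUpdate();
        }
    }
}
```

Because the value is bound rather than embedded in the statement, the limit on string literals does not come into play, and ' characters in the MRV need no escaping.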





You could avoid the JDBC code changes by storing the MRV sources in a JChem table using UpdateHandler.
(JChem is prepared for very large input structures.)

The JChem structure table can also be handy if you later have to search these structures for some reason.

Of course, you can add your own data columns to the JChem table.

Wouldn't this latter solution be convenient for you?

Best regards,

Szilard

User 870ab5b546

05-09-2006 13:39:18

We are somewhat averse to increasing our dependence on the JChem tables because we don't like having to regenerate them every time JChem is updated and because regeneration is irreversible. But thanks for the suggestions; Rafi thinks the CLOB may be the way to go.