Read from SD file using cartridge

User 952e1d9361

09-01-2010 09:13:44

Hello,


Is it possible to extract structure and attribute data from an SD file using the JChem cartridge? Ideally I'd like to take an SD file that's stored in a CLOB and iterate over the 'records', extracting the data one by one, including the structure and any other data that may be in the file.


If it is not, can you recommend an approach that may help me to do this entirely inside the database?


Many thanks,


Steve H

ChemAxon aa7c50abf8

09-01-2010 10:51:28

Hello Steve,


Is it possible to extract structure and attribute data from an SD file using the JChem cartridge? Ideally I'd like to take an SD file that's stored in a CLOB and iterate over the 'records', extracting the data one by one, including the structure and any other data that may be in the file.

It is currently not possible.


If you want to do it entirely in the database you can create an appropriate Java Stored Procedure using MRecordReader from the JChem API. The relevant code portion would look something like:


 


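            import chemaxon.formats.MFileFormatUtil;
            import chemaxon.marvin.io.MPropertyContainer;
            import chemaxon.marvin.io.MRecord;
            import chemaxon.marvin.io.MRecordReader;

            // inputStreamFromTheClob: an InputStream over the CLOB holding the SD file
            MRecordReader recordReader = MFileFormatUtil.createRecordReader(inputStreamFromTheClob,
                    null);
            int recordCount = 0;
            MRecord rec = recordReader.nextRecord();
            while (rec != null) {
                recordCount++;

                // -- Structure: header + record text + footer
                String header = recordReader.getHeaderAsString();
                String footer = recordReader.getFooterAsString();
                StringBuilder source = new StringBuilder();
                if (header != null) {
                    source.append(header);
                }
                source.append(rec.getString());
                if (footer != null) {
                    source.append(footer);
                }
                String structure = source.toString();
                // Use structure as appropriate

                // -- Properties: the data fields attached to this record
                MPropertyContainer propContainer = rec.getPropertyContainer();
                String[] keys = propContainer.getKeys();
                for (String key : keys) {
                    String value = propContainer.getString(key);
                    // Use property key and value as appropriate
                }

                // Advance to the next record; null means end of input
                rec = recordReader.nextRecord();
            }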
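
To illustrate what such a loop walks over, here is a minimal plain-Java sketch of the SD record layout, usable for experimenting when the JChem jars are not at hand. It is not MRecordReader and makes no claim to match its behaviour. SdRecord and SdSplitter are hypothetical names, and the sketch assumes records terminated by a "$$$$" line, a structure block ending in "M  END", and data items introduced by "> <NAME>" header lines.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical holder (not a JChem class): molfile text plus data items. */
final class SdRecord {
    final String molfile;
    final Map<String, String> properties;

    SdRecord(String molfile, Map<String, String> properties) {
        this.molfile = molfile;
        this.properties = properties;
    }
}

/** Hypothetical splitter (not a JChem class). */
final class SdSplitter {
    /** Splits SD text into records; a trailing record without "$$$$" is ignored. */
    static List<SdRecord> split(Reader in) throws IOException {
        BufferedReader reader = new BufferedReader(in);
        List<SdRecord> records = new ArrayList<SdRecord>();
        StringBuilder mol = new StringBuilder();
        Map<String, String> props = new LinkedHashMap<String, String>();
        StringBuilder value = new StringBuilder();
        boolean inMol = true;   // true until "M  END" has been seen
        String key = null;      // name of the data item being read, if any
        String line;
        while ((line = reader.readLine()) != null) {
            if (line.startsWith("$$$$")) {
                // End of record: flush any open data item and start afresh
                if (key != null) {
                    props.put(key, value.toString());
                }
                records.add(new SdRecord(mol.toString(), props));
                mol = new StringBuilder();
                props = new LinkedHashMap<String, String>();
                value.setLength(0);
                inMol = true;
                key = null;
            } else if (inMol) {
                mol.append(line).append('\n');
                if (line.startsWith("M  END")) {
                    inMol = false;
                }
            } else if (line.startsWith(">")) {
                // Data header such as "> <ID>"; headers without <NAME> are skipped
                if (key != null) {
                    props.put(key, value.toString());
                }
                int open = line.indexOf('<');
                int close = line.indexOf('>', open + 1);
                key = (open >= 0 && close > open) ? line.substring(open + 1, close) : null;
                value.setLength(0);
            } else if (key != null) {
                if (line.length() == 0) {
                    // A blank line terminates the current data item
                    props.put(key, value.toString());
                    key = null;
                } else {
                    if (value.length() > 0) {
                        value.append('\n');
                    }
                    value.append(line);
                }
            }
        }
        return records;
    }
}
```

Calling SdSplitter.split(new java.io.StringReader(sdText)) returns one SdRecord per "$$$$"-terminated record. Inside a stored procedure the Reader could presumably come from the CLOB's character stream.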


Peter

User 952e1d9361

09-01-2010 16:31:39

That's great, thanks Peter for the quick reply and the sample code.


Tell me, are these classes already loaded into the JChem cartridge schema (JCHEM in my case), or do they need to be loaded in via loadjava? If the latter, which jar(s) do I need?


Thanks again,


Steve

ChemAxon aa7c50abf8

09-01-2010 17:09:18

For this particular functionality, you need to load MarvinBeans.jar with loadjava. Note that recent Marvin/JChem versions require Java 5 or newer, which means that they will work with Oracle 11g or newer. As far as I can remember, JChem/Marvin versions prior to 5.2 were compatible with Java 1.4, so those earlier versions should work with 10g as well.


Peter

User 952e1d9361

23-10-2011 21:16:40

Hello Peter,


As a follow-up to this post from last year, there is now a requirement to test each MOL to see if it's 'valid'.


Currently our code is based on your sample above to iterate over the records and extract the MOL and 'attributes' and then copy them into a temporary table which is then read using PL/SQL.


Occasionally, however, invalid MOL data is loaded, which we would like to identify as soon as possible. Because we copy all the SD file records into this temporary table and then process it in PL/SQL, with large SD files it can be some time before a 'bad' record is spotted.


So, to summarise, I am looking for some Java code that I can use to test the 'structure' variable in your code above for validity. I am not sure how best to test it: is there some kind of 'isValid' function, or could I simply try to convert it to another format and see whether it fails or succeeds?


Many thanks


Steve

ChemAxon aa7c50abf8

24-10-2011 11:00:55

Hi Steve,


A very basic kind of a check is to try and simply import the structure string:


            try {
                Molecule m = MolImporter.importMol(structure);
            } catch (Exception exception) {
                // handle problematic structure as appropriate
            }

Advanced checks can be performed using the various implementations of the StructureChecker interface. For example:



            Molecule m = null;
            try {
                m = MolImporter.importMol(structure);
                OverlappingBondsChecker obChecker = new OverlappingBondsChecker();
                StructureCheckerResult checkResult = obChecker.check(m);
                if (checkResult != null) {
                    // handle problematic structure as appropriate
                }
            } catch (Exception exception) {
                // handle problematic structure  as appropriate
            }
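
As a cheap first pass before the import, the counts line of a V2000 molfile can be compared with the number of lines that actually follow it. The sketch below is plain Java and is not a JChem check; MolfilePrecheck is a hypothetical name. It assumes the usual V2000 layout (three header lines, then a counts line whose first two 3-character fields are the atom and bond counts, then the atom and bond lines, then "M  END"). It only catches truncated or mangled blocks and says nothing about chemical validity, so it complements the import rather than replacing it.

```java
/** Hypothetical helper (not a JChem class): structural sanity check only. */
final class MolfilePrecheck {
    /** Returns null if the block looks structurally complete, else a short reason. */
    static String problem(String molfile) {
        String[] lines = molfile.split("\r?\n", -1);
        if (lines.length < 5) {
            return "too few lines for a molfile";
        }
        String counts = lines[3];
        if (counts.indexOf("V3000") >= 0) {
            return null; // V3000 keeps its counts elsewhere; not checked in this sketch
        }
        if (counts.length() < 6) {
            return "counts line too short";
        }
        int atoms;
        int bonds;
        try {
            atoms = Integer.parseInt(counts.substring(0, 3).trim());
            bonds = Integer.parseInt(counts.substring(3, 6).trim());
        } catch (NumberFormatException e) {
            return "non-numeric atom or bond count";
        }
        if (atoms < 0 || bonds < 0) {
            return "negative atom or bond count";
        }
        // "M  END" must appear at or after the first line following the bond block
        for (int i = 4 + atoms + bonds; i < lines.length; i++) {
            if (lines[i].startsWith("M  END")) {
                return null;
            }
        }
        return "M  END not found after the declared atom and bond lines";
    }
}
```

A record for which problem(structure) returns a non-null reason can be rejected straight away; everything else would still go through the import shown above.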



The "JChem Cartridge way" to do this is using jc_evaluate_x operator with the appropriate Chemical Terms checker function. For example:



select jc_evaluate_x('<structure-comes-here>',
'chemTerms:check("aromaticity..valence..queryAtom..queryBond")') from dual;



The available checker options which are accepted in the checker configuration string are described here: http://www.chemaxon.com/jchem/doc/user/structurecheck_cline.html#options .








Peter

User 952e1d9361

24-10-2011 14:29:01

Thanks Peter, that's exactly what I was looking for.


Steve