Processing of database fields

User 2e29183b3d

06-03-2007 18:20:42

Hi,





I have imported an SDF file, and one of the SDF tags is, let's say, called "CLASS" and can have multiple discrete values out of {1,2,3,4, ...}. In the SDF file each of these classes is on its own line. When the SDF file is imported into Instant JChem, they turn up as several lines in a VARCHAR field.





Are there any functions to access each of those lines? What if I wanted to extract these classes into new fields, class1, class2, class3? I found the chemical terms function fields('...'), but it didn't even copy the contents of the field to the new CT field.





I found that searching with LIKE and %class% as the query finds all molecules with a given class, but what if I want to find 1) all molecules in classX but NOT in classY, or 2) all molecules ONLY in classX?





Thanks a lot,





Florian

ChemAxon fa971619eb

06-03-2007 19:07:27

I don't think IJC currently has a simple solution to this. Some of the developments that are underway should be of use in the future. But in the meantime here are some tricks that might be worth a try.





1. The most obvious is to try to regenerate the SD file so that each of your terms is a separate field. If you give the field appropriate values (e.g. true or false) then you can probably import them into a boolean field, which would make them easily searchable.





2. Failing this, your best bet may be to create a new field for each of your values. Then search for each of the values in the original text field (using the LIKE operator and an expression such as %VALUE1%) to select only those rows that contain your particular value. Then select the cells of the appropriate new column for all the rows matching your query, and paste a value (e.g. Y) into the selected cells. If you do this for each of your values then you would end up with a separate set of fields that can all be used as part of a query.
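
To make the logic behind the two queries concrete, here is a minimal Java sketch. It is not IJC functionality: the class name ClassMembership and its methods are hypothetical, and it assumes the imported text holds one class value per line. Note that it compares whole lines, whereas a substring pattern such as %1% would also match a value like 10.

```java
import java.util.Set;
import java.util.TreeSet;

// Hypothetical helper for illustration; not part of Instant JChem or Marvin.
class ClassMembership {

    // Split the multi-line text into exact values, one per non-empty line.
    static Set<String> parse(String multiLineValue) {
        Set<String> values = new TreeSet<>();
        if (multiLineValue == null) {
            return values;
        }
        for (String line : multiLineValue.split("\\R")) {
            String trimmed = line.trim();
            if (!trimmed.isEmpty()) {
                values.add(trimmed);
            }
        }
        return values;
    }

    // Query 1: the molecule is in class x but NOT in class y.
    static boolean inButNotIn(Set<String> values, String x, String y) {
        return values.contains(x) && !values.contains(y);
    }

    // Query 2: the molecule is ONLY in class x.
    static boolean onlyIn(Set<String> values, String x) {
        return values.size() == 1 && values.contains(x);
    }

    public static void main(String[] args) {
        Set<String> values = parse("1\n3\n10");
        System.out.println(inButNotIn(values, "1", "2"));
        System.out.println(onlyIn(values, "1"));
    }
}
```

With one Y/N or boolean field per class, query 1 becomes "field X is set AND field Y is not set", and query 2 becomes "field X is set AND every other class field is not set".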





Hopefully some variation on this theme might work for you. We'll look at the options for better solutions in future versions of IJC.





Tim

ChemAxon a3d59b832c

21-03-2007 09:08:11

Maybe you could write a Java program using the Marvin API to read an SDFile and write another one with the transformed sdf fields. You will need the following classes and methods:





http://www.chemaxon.com/marvin/doc/api/chemaxon/formats/MolImporter.html


http://www.chemaxon.com/marvin/doc/api/chemaxon/struc/Molecule.html


http://www.chemaxon.com/marvin/doc/api/chemaxon/struc/Molecule.html#getPropertyCount()


http://www.chemaxon.com/marvin/doc/api/chemaxon/struc/Molecule.html#getProperty(java.lang.String)


http://www.chemaxon.com/marvin/doc/api/chemaxon/struc/Molecule.html#setProperty(java.lang.String,%20java.lang.String)





You may also have a look at the examples: http://www.chemaxon.com/marvin/examples/index.html


Particularly SimpleConverter.
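
As a rough sketch of the per-molecule transformation such a program would perform, the following plain Java splits a multi-line value into numbered fields (class1, class2, ...). The class name ClassFieldSplitter and the field prefix are hypothetical, and it is an assumption that the property value arrives as newline-separated text. In a real converter the input would come from Molecule.getProperty("CLASS") and each entry would be written back with Molecule.setProperty(name, value), as listed in the links above; those calls are left out so that the sketch runs with only the JDK.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper for illustration; not part of the Marvin API.
class ClassFieldSplitter {

    // Turn a multi-line value into numbered fields, keeping the original order.
    // Blank lines are skipped and do not consume a number.
    static Map<String, String> split(String multiLineValue, String prefix) {
        Map<String, String> fields = new LinkedHashMap<>();
        if (multiLineValue == null) {
            return fields;
        }
        int index = 1;
        for (String line : multiLineValue.split("\\R")) {
            String value = line.trim();
            if (!value.isEmpty()) {
                fields.put(prefix + index, value);
                index++;
            }
        }
        return fields;
    }

    public static void main(String[] args) {
        // Each map entry corresponds to one new SDF field for the molecule.
        for (Map.Entry<String, String> e : split("1\n3\n4", "class").entrySet()) {
            System.out.println(e.getKey() + " = " + e.getValue());
        }
    }
}
```

Numbered fields match the class1, class2, class3 layout asked about in the first post. For searching, one true/false field per class value (as in suggestion 1 above) may be easier to query, since a given class then always lives in the same field.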





I hope this helps,


Szabolcs