nonstandard extensions for MRV

ChemAxon 587f88acea

22-02-2006 19:19:15

Hello,





The MOL format can be adapted with new extensions. JChem preserves information stored in nonstandard extensions when it stores the MOL file in a table. Is there a corresponding way to create nonstandard extensions in the MRV format? Will JChem preserve that information when the MRV file is placed in the table?





To be specific, I want to add an extension to MRV that denotes how many unshared electrons each atom has. (The number could be correct or incorrect.)





-- Bob

ChemAxon 9c0afc9aaf

23-02-2006 19:42:47

Hi Bob,





In the current implementation of JChem every non-standard attribute is removed during the import process, so this solution would not work.





Our concept is that JChem should remove data fields from the input files, as they belong in database columns instead.
You can select which data fields go to which database columns.





This usually means cutting out a specific section of the input source and leaving the other parts in their original form (e.g. from SDFiles we keep the molfile part).
The current data field removal method for MRV, however, started out as a temporary solution. Because it contains an import-export step, all extra fields are removed as well.





We plan to improve on this method, and may also consider making the data field removal optional (compression for certain formats can already be disabled).





However, I would like to point out the disadvantages of using non-standard extensions:

- Only your code will be able to make sense of this extra data; other programs will ignore it.
- If you need to process the molecule in some way, so that a source -> Molecule object -> source conversion (import/export) takes place, you'll lose the information once again.





I suggest adding your extra data as a molecule property (data field) in the standard way (e.g. a property named "UNSHARED_E" containing a string like "2 0 0 1 0").





For a Molecule object you can call setProperty():





http://www.chemaxon.com/jchem/doc/api/chemaxon/struc/Molecule.html#setProperty(java.lang.String,%20java.lang.String)





It will be stored in <property> tags in the MRV file.





During the import process you can instruct JChem to put this data into a custom column of the table.
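
As a rough illustration of the suggestion above, the per-atom counts can be packed into one space-separated string before being handed to setProperty(), and unpacked again after export. This is only a sketch: the class name UnsharedElectronsCodec and its methods are hypothetical helpers, not part of JChem, and the code deliberately avoids the ChemAxon classes so that it runs with a plain JDK.

```java
import java.util.Arrays;
import java.util.stream.Collectors;

// Hypothetical helper, not part of JChem: converts between per-atom
// counts and the single string stored in the molecule property.
class UnsharedElectronsCodec {
    // Property name taken from the example in the post above.
    static final String PROPERTY_NAME = "UNSHARED_E";

    /** Joins per-atom counts (in atom order) into one space-separated string. */
    static String encode(int[] countsPerAtom) {
        return Arrays.stream(countsPerAtom)
                .mapToObj(Integer::toString)
                .collect(Collectors.joining(" "));
    }

    /** Splits the stored string back into per-atom counts. */
    static int[] decode(String value) {
        String trimmed = value.trim();
        if (trimmed.isEmpty()) {
            return new int[0];
        }
        return Arrays.stream(trimmed.split("\\s+"))
                .mapToInt(Integer::parseInt)
                .toArray();
    }

    public static void main(String[] args) {
        int[] counts = {2, 0, 0, 1, 0};
        String value = encode(counts);
        // With JChem on the classpath, the value would be attached with
        // molecule.setProperty(PROPERTY_NAME, value), as described above.
        System.out.println(PROPERTY_NAME + " = " + value);
        System.out.println(Arrays.toString(decode(value)));
    }
}
```

The order of the numbers has to match the atom order of the molecule, so the string is only meaningful as long as the atoms are not reordered between writing and reading.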





Best regards,





Szilard

ChemAxon 587f88acea

23-02-2006 20:07:19

If we are creating an MRV file ourselves, we can use the <property> tag to denote a nonstandard property of the array of atoms, right? Like so:





<?xml version="1.0" ?>
<MDocument>
  <MChemicalStruct>
    <molecule molID="m1">
      <atomArray
          atomID="a1 a2 a3 a4 a5"
          elementType="C H N H O"
          formalCharge="0 0 0 0 -1"
          x2="-5.666320878171964 -5.666320878171964 -7.0 -8.487525772485165 -4.332641756343929"
          y2="2.4616667222976685 4.001666722297669 1.6916667222976685 2.090248051755551 1.6916667222976685"
          />
      <bondArray>
        <bond atomRefs2="a3 a1" order="2" />
        <bond atomRefs2="a1 a5" order="1" />
        <bond atomRefs2="a1 a2" order="1" />
        <bond atomRefs2="a3 a4" order="1" />
      </bondArray>
      <propertyList>
        <property title="unsharedElectrons">
          <array datatype="integer">0 0 2 0 6</array>
        </property>
      </propertyList>
    </molecule>
  </MChemicalStruct>
</MDocument>





And JChem will preserve the unsharedElectrons array when it imports and exports the file from the database, right?

ChemAxon 9c0afc9aaf

24-02-2006 12:00:45

Quote:
And JChem will preserve the unsharedElectrons array when it imports and exports the file from the database, right?

Not necessarily.


You should add an extra column to your JChem table where you will store these values.


The import process removes the data fields, but puts the values into the specified DB column.


During export these data columns can be added to the exported molecule again.


(see my previous post for more detail)
Quote:
If we are creating an MRV file ourselves, we can use the <property> tag to denote a nonstandard property of the array of atoms, right? Like so:
For the process above you should use "scalar" properties, so that the value can be imported into the DB as a data field (properties of this kind also appear in Marvin View).


Your data should be in a single string, e.g. "1 2 3 0 3".


Example:


Code:



<propertyList>
        <property dictRef="Name1" title="Name1">
           <scalar>blahblah</scalar>
        </property>
        <property dictRef="unsharedElectrons" title="unsharedElectrons">
           <scalar><![CDATA[1 2 3 0 3]]></scalar>
        </property>
</propertyList>






Please note that you may have to use CDATA if your string contains spaces or other characters that are treated specially in XML (e.g. "<", ">").
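
For code that reads such a file outside JChem, a minimal sketch of pulling the scalar values back out is given here. It uses the DOM parser that ships with the JDK, not JChem's own MRV importer, so it says nothing about how JChem treats whitespace; the class name ScalarPropertyReader is a hypothetical helper. A standard XML parser returns the contents of a CDATA section as ordinary text, so the caller does not need to treat it specially.

```java
import java.io.StringReader;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper, not part of JChem: reads <property> elements
// that carry a <scalar> child and maps each title to its text value.
class ScalarPropertyReader {
    static Map<String, String> readScalars(String xml) {
        try {
            DocumentBuilder builder =
                    DocumentBuilderFactory.newInstance().newDocumentBuilder();
            Document doc = builder.parse(new InputSource(new StringReader(xml)));
            Map<String, String> result = new LinkedHashMap<>();
            NodeList properties = doc.getElementsByTagName("property");
            for (int i = 0; i < properties.getLength(); i++) {
                Element property = (Element) properties.item(i);
                NodeList scalars = property.getElementsByTagName("scalar");
                if (scalars.getLength() > 0) {
                    // getTextContent() includes CDATA content as plain text.
                    result.put(property.getAttribute("title"),
                            scalars.item(0).getTextContent());
                }
            }
            return result;
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse property list", e);
        }
    }

    public static void main(String[] args) {
        String xml = "<propertyList>"
                + "<property dictRef=\"Name1\" title=\"Name1\">"
                + "<scalar>blahblah</scalar></property>"
                + "<property dictRef=\"unsharedElectrons\" title=\"unsharedElectrons\">"
                + "<scalar><![CDATA[1 2 3 0 3]]></scalar></property>"
                + "</propertyList>";
        readScalars(xml).forEach((title, value) ->
                System.out.println(title + " -> " + value));
    }
}
```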





Best regards,





Szilard