question about custom standardization upon import/search

User f52820d97e

11-07-2006 12:55:08

Hi,


I would like some clarification about the standardization procedure in JChem Base (forgive me if this is not the right forum, I wasn't sure...)


As described in http://chemaxon.com/jchem/doc/admin/index.html#standardization, upon import (of an SDF file, for example) with jcman one can specify standardization options for storing the structures (cd_smiles) in the tables.


From what I understood at the UGM (but I may be mistaken), any query (substructure or similarity) is then run against the stored standardized form, but the original SDF structure is displayed in the results. Is that correct? This is also what I understand from
Quote:
The tasks specified as optional in the Standardizer configuration file are only performed on the molecules imported into the database, but not on the queries.
which means that the default behavior


Code:
<?xml version="1.0" encoding="UTF-8"?>
<!-- Standardizer configuration file -->
<StandardizerConfiguration>
    <Actions>
        <Aromatize ID="aromatize"/>
        <Dehydrogenize ID="dehydrogenize" Optional="true"/>
    </Actions>
</StandardizerConfiguration>

would aromatize on import and query, but would dehydrogenize only on import?


This would be useful for me: I currently keep two separate tables for each set of molecules, one with salts and one processed with the molconvert -F option to remove salts, because I was afraid a similarity search would give poor results when salts are present. A custom standardization would eliminate the necessity for 2 tables...


A last point to be sure: is
Code:
<Removal ID="keepOne" Method="keepLargest" Measure="atomCount"/>
the equivalent of the molconvert -F option (which works well for the type of structures I have)?


Thank you,


Nicolas

ChemAxon 9c0afc9aaf

11-07-2006 13:25:12

Hi,
Quote:
... upon import (of an SDF file, for example) with jcman one can specify standardization options for storing the structures (cd_smiles) in the tables.
Actually not upon import, but upon table creation.


If you want to specify / change this for an existing table, you can do it in the regenerate menu.
Quote:



From what I understood at the UGM (but I may be mistaken), any query (substructure or similarity) is then run against the stored standardized form, but the original SDF structure is displayed in the results. Is that correct?
Yes, and the query is also standardized automatically at the start of the search according to the standardization rule of the table.





Yes, the configuration you posted will dehydrogenize only the imported (target) structures. This is because explicit H atoms are important for queries, but not for target structures (removing them from the targets reduces their graph size).
Quote:
A custom standardization would eliminate the necessity for 2 tables...
Yes, it seems to be a natural solution for your case.





And finally: yes, the Standardizer action you mention is equivalent to molconvert -F.
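
As a rough sketch (not an official ChemAxon example), the actions discussed in this thread could be combined into a single Standardizer configuration for the table. It uses only the elements quoted above. Placing the Removal action inside Actions, and listing it before the other two, are assumptions to check against the Standardizer documentation. Whether Removal should also be marked Optional="true" depends on whether smaller fragments should be stripped from queries as well, so that attribute is left out here.

Code:
<?xml version="1.0" encoding="UTF-8"?>
<!-- Sketch only: combines the actions mentioned in this thread -->
<StandardizerConfiguration>
    <Actions>
        <!-- Assumed position: keep only the largest fragment (by atom count), like molconvert -F -->
        <Removal ID="keepOne" Method="keepLargest" Measure="atomCount"/>
        <!-- Applied to both imported structures and queries -->
        <Aromatize ID="aromatize"/>
        <!-- Optional: applied to imported (target) structures only, not to queries -->
        <Dehydrogenize ID="dehydrogenize" Optional="true"/>
    </Actions>
</StandardizerConfiguration>

With a configuration along these lines, a single table could serve both purposes, since the stored standardized form would be salt-free while the original structure is still displayed in the results.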





Best regards,





Szilard

User f52820d97e

11-07-2006 13:49:53

Hey Szilard, did I tell you you were my hero? ;-)


Joking aside, thanks a lot for this. It is going to help me a great deal.


I obviously meant upon table creation, since the option does not exist when importing...


One last point: if you dehydrogenize the imported (target) structures, what happens if you explicitly specify a query structure with a hydrogen (which is then not dehydrogenized)? Would it never match the target? Or am I completely off?


Cheers,


Nicolas

ChemAxon 9c0afc9aaf

11-07-2006 14:15:46

Quote:
One last point: if you dehydrogenize the imported (target) structures, what happens if you explicitly specify a query structure with a hydrogen (which is then not dehydrogenized)? Would it never match the target? Or am I completely off?
There will be a hit, because the target contains implicit hydrogens in this case (calculated from valence).


Explicit and implicit hydrogens behave in exactly the same way in the target structures.





Please also see the "Explicit hydrogens" section of our Query Guide:





http://www.chemaxon.com/jchem/doc/user/Query.html#explH





Regards,





Szilard

User f52820d97e

11-07-2006 14:21:06

Great, I get it! Forgive my naive ignorance...


Nicolas