aromatize acts like aromatize/basic for some heterocyclics

User 39d0b79643

22-01-2007 16:42:13

The following Standardizer configuration file worked in 3.1.7:





<StandardizerConfiguration Version="0.1">
  <Actions>
    <Action ID="aromatize" Act="aromatize"/>
    <Action ID="dehydrogenze" Act="dehydrogenize"/>
    <!-- put pre-prediction standardization reactions here -->
    <!-- the paths to the .mrv files are relative to this file -->
    <Reaction ID="deprotonate carboxyls" Structure="carboxyl_deprot.mrv"/>
  </Actions>
</StandardizerConfiguration>





However, starting with 3.2 and continuing through 3.2.3, many heterocyclic compounds are standardized as if


<Aromatize ID="aromatize"/>


were instead


<Aromatize ID="aromatize" Type="basic"/>





Using caffeine, the first example used in


http://www.chemaxon.com/jchem/doc/user/Standardizer_files/examples/Examples.html





processCompForm: in smiles=CN1C=NC2=C1C(=O)N(C(=O)N2C)C


processCompForm: standardized mol=Cn1cnc2n(C)c(=O)n(C)c(=O)c12





Both rings in the caffeine molecule are aromatized, instead of only the five-membered ring, which is what we want and what the Examples.html page shows for <Aromatize ID="aromatize"/>.





More examples can be provided if needed.





Thank you for any assistance you can provide.





- Lynda Ellis

ChemAxon e08c317633

23-01-2007 15:55:47

Hi,





As of JChem 3.2, the default aromatization type is "general" (see JChem Changes). It seems the Standardizer examples have not been updated yet, and they also do not correctly show the JChem 3.1 behavior. Sorry for this mistake; we will update them.





General (also named "Daylight" in versions before JChem 3.2) type aromatization returns this structure (using caffeine): Cn1cnc2n(C)c(=O)n(C)c(=O)c12


Basic type aromatization returns this structure (using caffeine): CN1c2ncn(C)c2C(=O)N(C)C1=O





The default aromatization has been the "general" type since JChem 3.2. Before JChem 3.2 the default aromatization type was "basic".
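Because the default differs between versions, one option is to state the type explicitly so that the configuration does not depend on it. A minimal sketch, reusing the Action form from the configuration in the first post; the Type attribute on this element and the value spelling "general" are assumptions here and should be checked against the Standardizer documentation for your version:

```xml
<!-- Pre-3.2 default behaviour, requested explicitly -->
<Action ID="aromatize" Act="aromatize" Type="basic"/>

<!-- 3.2+ default behaviour, requested explicitly.
     Assumption: the value is spelled "general", like the type name. -->
<Action ID="aromatize" Act="aromatize" Type="general"/>
```

Only one of the two lines would appear in a real configuration file.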





Related documentation: http://www.chemaxon.com/marvin/doc/user/aromatization-doc.html





Note: Since JChem 3.2 Reactor doesn't require the input molecules to be in the same "aromatized state" as the reactants are in the reaction scheme. Reactor will find an aromatized input reactant even if the reactants are not aromatized in the reaction scheme.





Best regards,


Zsolt

User 941c2467a3

25-01-2007 05:48:18

Thank you, Zsolt. Your suggestion is very helpful!


We modified our Standardizer configuration file as follows:


<Action ID="aromatize" Act="aromatize" Type="basic"/>


and it works well with caffeine:





Before standardization:


CN1C=NC2=C1C(=O)N(C(=O)N2C)C


After standardization:


CN1c2ncn(C)c2C(=O)N(C)C1=O
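Put together, the complete file would look roughly like the sketch below. This assumes the rest of the configuration is the one from the first post and that only the aromatize action changes:

```xml
<StandardizerConfiguration Version="0.1">
  <Actions>
    <!-- Explicit type, so the result does not depend on the version default -->
    <Action ID="aromatize" Act="aromatize" Type="basic"/>
    <Action ID="dehydrogenze" Act="dehydrogenize"/>
    <!-- Path is relative to this configuration file -->
    <Reaction ID="deprotonate carboxyls" Structure="carboxyl_deprot.mrv"/>
  </Actions>
</StandardizerConfiguration>
```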





However, we have an additional question about the "UpdateHandler" class. We get the desired SMILES string (e.g., for caffeine, CN1c2ncn(C)c2C(=O)N(C)C1=O) from the standardizer and want to store it in the database.





In our code, we use the "UpdateHandler" class to update the fixed columns of the compound table, passing the standardized SMILES string to its "setStructure()" method. For example, setStructure("CN1c2ncn(C)c2C(=O)N(C)C1=O").





After the "UpdateHandler" is executed, I query the cd_smiles field of the compound table in the MySQL database. The stored SMILES string is "Cn1cnc2n(C)c(=O)n(C)c(=O)c12" rather than the "CN1c2ncn(C)c2C(=O)N(C)C1=O" we want.





I'd like to know whether the "UpdateHandler" converts the SMILES string in some way, and whether we can set an option on the "UpdateHandler" so that SMILES strings are stored in the form we want.





Thanks!





Jeff

ChemAxon 9c0afc9aaf

25-01-2007 16:39:40

Hi Jeff,





Standardization in JChem is handled transparently. Each table can have a custom standardizer configuration, which is applied automatically when structures are inserted, imported or updated.





Please see the following sections in the documentation:





http://www.chemaxon.com/jchem/doc/user/Query.html#standardizationDB





http://www.chemaxon.com/jchem/doc/admin/#create





There are two important things to note:





1. There is no need to explicitly pre-standardize the structures before insert, update or search; the appropriate standardization is always performed, and the search results will be correct.





2. The standardization only affects the cd_smiles field (which is used for searching); the cd_structure field (used for display) is not changed.





So this is what happens in your case:


- You pre-standardize the structure in some way before passing it to UpdateHandler


- There is a different custom or default standardization set for the structure table, and the structure is standardized again according to it during import.





You should change the standardization rule of the table to use the desired configuration. This can be done in the Regenerate menu of jcman (JChemManager).





There is no need to explicitly standardize the structures, unless you also want them displayed in some standardized form.





1. You can use any standardizer configuration to transform the structures on-the-fly during display, even different ones for each user according to their preferences. This will not affect the search.





2. You may continue to pre-standardize the structures before insert, but I do not recommend this solution because


- you would always have to make sure the two configurations are the same


- if your standardizer configuration or our standardizer code changes or improves, the structures in the database have already been altered, and the original structures are no longer available for regeneration.





Best regards,





Szilard