Search in a non-standardized DB

User dfeb81947d

14-12-2005 15:57:20

Dear Support,





You told me once that for a StructureSearch using JChemSearch, the query is cleaned (aromatized, ...).


If I have a database which is not standardized, I will miss some hits.


But what about aromatic rings?


If my database contains rings written as C1=C-C=C-C=C1, will they be found if my query is c1ccccc1?





Thank you for your help





Warmest Regards


Jacques

ChemAxon 9c0afc9aaf

14-12-2005 17:55:56

Dear Jacques,
Quote:
You told me once that for a StructureSearch using JChemSearch, the query is cleaned (aromatized, ...).
Actually "standardized", not "cleaned".


"Cleaning" usually refers to calculating 2D or 3D coordinates for a structure, which is not needed for JChemSearch.
Quote:



If I have a database which is not standardized, I will miss some hits.


But what about aromatic rings?


If my database contains rings written as C1=C-C=C-C=C1, will they be found if my query is c1ccccc1?
Yes, you will find them of course.


During the database search both the query and target structures are always standardized.





The target structures are automatically standardized during import.


The query is standardized at the start of the search.


The same standardization rule is used in both cases; it is a table-specific setting.
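To illustrate why applying one rule on both sides makes the Kekule and aromatic forms match, here is a minimal toy sketch. It is not the JChem API: the class `ToySearchTable`, its methods, and the lookup-based `toyAromatize` are all hypothetical stand-ins, and the comparison is a plain exact match rather than a real structure search.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.UnaryOperator;

/** Toy model only: none of these names come from the JChem API. */
class ToySearchTable {
    // The table-specific rule, applied to targets and to queries alike.
    private final UnaryOperator<String> rule;
    // id -> standardized form used for matching.
    private final Map<Integer, String> searchForms = new LinkedHashMap<>();

    ToySearchTable(UnaryOperator<String> rule) {
        this.rule = rule;
    }

    /** Import: the rule is applied before the search form is stored. */
    void importStructure(int id, String structure) {
        searchForms.put(id, rule.apply(structure));
    }

    /** Search: the same rule is applied to the query, then forms are compared. */
    List<Integer> search(String query) {
        String standardizedQuery = rule.apply(query);
        List<Integer> hits = new ArrayList<>();
        for (Map.Entry<Integer, String> row : searchForms.entrySet()) {
            if (row.getValue().equals(standardizedQuery)) {
                hits.add(row.getKey());
            }
        }
        return hits;
    }

    /** Hypothetical stand-in for aromatization: a lookup for Kekule benzene only. */
    static String toyAromatize(String smiles) {
        if (smiles.equals("C1=C-C=C-C=C1") || smiles.equals("C1=CC=CC=C1")) {
            return "c1ccccc1";
        }
        return smiles;
    }
}
```

With `new ToySearchTable(ToySearchTable::toyAromatize)`, importing `C1=C-C=C-C=C1` under id 1 and then searching for `c1ccccc1` returns id 1, because both strings are reduced to the same form before they are compared.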





The table can use either default or custom standardization.


The default standardization is mainly aromatization.





For a more complicated standardization (e.g. bringing nitro groups to a standard form) you may need custom standardization.


This allows you to define your own rules in an XML file (requires a Standardizer license).





Please see the administration guide on how to set the standardization at table creation:





http://www.chemaxon.com/jchem/doc/admin/#create





You can change the standardization rule of existing tables with regeneration:





http://www.chemaxon.com/jchem/doc/admin/#regener





Best regards,





Szilard

User dfeb81947d

16-12-2005 10:12:54

Dear Szilard,





Thank you for your help.


It seems that the import was modified between versions 3.1.3 and 3.1.4. Now the standardization is automatic and I need to add dom4j.jar.


Is this a new feature of JChem 3.1.4?





By the way, how can I apply a custom standardization during import with UpdateHandler?


Do I need to standardize the molecule before importing it, or is there a way to link the two?





Now I have a small problem with some structures:
Quote:



java.lang.ArrayIndexOutOfBoundsException: -1
    at chemaxon.marvin.modules.Parity.putToTheTop(Parity.java:2383)
    at chemaxon.marvin.modules.Parity.setupStereoBonds(Parity.java:796)
    at chemaxon.marvin.modules.Parity.setParity(Parity.java:427)
    at chemaxon.marvin.modules.Parity.modfunc(Parity.java:167)
    at chemaxon.struc.MoleculeGraph.setParity(MoleculeGraph.java:1442)
    at chemaxon.struc.MoleculeGraph.setParity(MoleculeGraph.java:1411)
    at chemaxon.marvin.modules.Hydrogenize.implicitizeHydrogens(Hydrogenize.java:265)
    at chemaxon.marvin.modules.Hydrogenize.callback(Hydrogenize.java:51)
    at chemaxon.struc.MoleculeGraph.callHydrogenize(MoleculeGraph.java:545)
    at chemaxon.struc.MoleculeGraph.hydrogenize(MoleculeGraph.java:483)
    at chemaxon.reaction.Standardizer.performAction(Standardizer.java:615)
    at chemaxon.reaction.Standardizer.standardizeComponent(Standardizer.java:1296)
    at chemaxon.reaction.Standardizer.standardize(Standardizer.java:1371)
    at chemaxon.jchem.db.TableInfo.standardize(TableInfo.java:1092)
    at chemaxon.jchem.db.UpdateHandler.init(UpdateHandler.java:762)
    at chemaxon.jchem.db.UpdateHandler.execute(UpdateHandler.java:1409)
    at chemaxon.jchem.db.UpdateHandler.execute(UpdateHandler.java:1388)
Are the structures bad?








Warmest Regards


Jacques

ChemAxon 9c0afc9aaf

16-12-2005 11:21:30

Dear Jacques,
Quote:



It seems that the import was modified between versions 3.1.3 and 3.1.4. Now the standardization is automatic and I need to add dom4j.jar.


Is this a new feature of JChem 3.1.4?
These are not new features.


Standardization has always been automatic in JChem.


Also, dom4j.jar has been required since about JChem 2.3.


Due to some minor modifications the absence of this file might be noticeable more frequently though.
Quote:
By the way, how can I apply a custom standardization during import with UpdateHandler?
To use custom standardization, you have to set an XML Standardizer configuration for the table in JChemManager.


(please see the link in my previous post)
Quote:
Do I need to standardize the molecule before importing it, or is there a way to link the two?
No.


After setting the configuration for the table in jcman, you do not have to do anything else: the structures are automatically standardized according to the rule, both during import and during search.





Regarding the exception:





This seems to be a bug concerning the removal of explicit H atoms.


This is also apparent in MarvinSketch (Edit->Remove->explicit H atoms).


It will be fixed in the next Marvin release, and the next JChem version which contains this new Marvin version.


Many thanks for the bug report.





Best regards,





Szilard

User dfeb81947d

16-12-2005 16:27:03

Dear Szilard





Thank you for your quick reply.
Szilard wrote:
Quote:
By the way, how can I apply a custom standardization during import with UpdateHandler?
To use custom standardization, you have to set an XML Standardizer configuration for the table in JChemManager.


(please see the link in my previous post)
I did as you mentioned above, but I was thinking about a batch application I have developed that performs the import under specific conditions, from a scheduled task.


That's why I use UpdateHandler.





So I have two ways of proceeding:


1) I standardize the molecule (structure) before calling


uh.setValuesForFixColumns(id, structure)


2) I import all the molecules and standardize manually with JChemManager using regeneration (which might take a long time for only a few newly inserted molecules).





I was wondering whether there is a method in UpdateHandler that allows setting an XML file for standardization.





I will choose the first method.





Thank you for your help


Have a nice weekend.





Warmest Regards


Jacques

ChemAxon 9c0afc9aaf

16-12-2005 16:33:07

Hi,





UpdateHandler automatically uses the XML that you have set in JChemManager, so you do not have to specify the configuration for UpdateHandler and you do not have to regenerate the table either.





Also, you do not have to explicitly standardize the molecule of course.





Best regards,





Szilard

User 86810cf9fa

04-01-2006 15:20:41

Dear Szilard,





As I am taking over Jacques' work, I have a new question.





I would like to apply a custom standardization when molecules are imported into the JChem structure table.





For that purpose I have a dedicated computer which runs a Java batch job that performs the import through the JChem API (UpdateHandler).


Is it possible to specify, for example, the path of my file 'standardize.xml' in the '.jchem' file located in the 'chemaxon' folder in the user root directory?


Or is it necessary to standardize each Molecule object manually using the chemaxon.reaction.Standardizer class (for every molecule that is to be imported)?





When I run JChemManager on my own computer and standardize the whole table with custom standardization (XML file), the information about the standardization rules is not saved, and each time I need to standardize the table I have to select the XML file again.





I am using JChem 3.1.4.





Thank you for your help.


Best regards,

ChemAxon 9c0afc9aaf

04-01-2006 18:16:50

Hi,





In JChem every table can have an associated XML configuration for custom standardization. The content of the XML configuration is saved in the database.


Once it has been specified, the standardization is transparent: you do not have to standardize either the input molecules or the queries. Our tools (both applications and API) use this standardization automatically.





This means that after the import you do not have to "standardize" the table, as standardization was performed automatically during the import process.





To determine the current standardization rule for a table, type:





Code:
jcman t <table_name>






This writes out the current standardization configuration (or states the table uses default standardization), followed by the table structure.





Also note that the standardization has no effect on the appearance of the structures; it only affects the search process.


Visualization of structures uses the cd_structure column, which always stores the input structures unchanged.


This allows you to


- view the structures in the original form (e.g. Kekule form of aromatic rings)


- change the standardization later.
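The two points above can be pictured with a small toy model. This is only a sketch under stated assumptions: `ToyRegenerableTable`, `Row`, and `regenerate` are hypothetical names, not the JChem schema or API, and the "nitro" rule used in the usage note is a naive string replacement standing in for a real Standardizer rule.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.UnaryOperator;

/** Toy two-column table; not the JChem schema or API. */
class ToyRegenerableTable {
    /** One row: the untouched input plus the derived search form. */
    static final class Row {
        final String original;   // plays the role of cd_structure, never modified
        String searchForm;       // derived from the original, rebuilt on regeneration

        Row(String original, String searchForm) {
            this.original = original;
            this.searchForm = searchForm;
        }
    }

    private UnaryOperator<String> rule;
    private final List<Row> rows = new ArrayList<>();

    ToyRegenerableTable(UnaryOperator<String> rule) {
        this.rule = rule;
    }

    /** Import keeps the input as-is and stores the standardized form next to it. */
    void importStructure(String original) {
        rows.add(new Row(original, rule.apply(original)));
    }

    /** Changing the rule later is possible because the originals were kept. */
    void regenerate(UnaryOperator<String> newRule) {
        rule = newRule;
        for (Row row : rows) {
            row.searchForm = newRule.apply(row.original);
        }
    }

    String original(int index) {
        return rows.get(index).original;
    }

    String searchForm(int index) {
        return rows.get(index).searchForm;
    }
}
```

For example, a table created with the identity rule `s -> s` and later regenerated with `s -> s.replace("[N+](=O)[O-]", "N(=O)=O")` ends up with new search forms, while `original(0)` still returns exactly what was imported.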








I assume that, knowing all this, you will need to locate the XML much less often (if I understood your question correctly): only when creating a new table or when changing the standardization rule.


Regardless, we should indeed remember the location of the last file, so that the file open dialog can point there next time.


We plan to implement this in the future.


In the long run we are also planning to store named configurations in the DB, so that when creating a new table the user can select from previously uploaded rules.





I hope I could answer your question, please let me know if something needs more clarification.





Best regards,





Szilard

User 86810cf9fa

10-01-2006 10:44:39

Dear Szilard,





Thank you very much for your answer.


I wonder where this information is stored. I could not find it in the database.





Thank you very much.


Severine

ChemAxon 9c0afc9aaf

10-01-2006 11:08:05

Hi,





This setting is stored in the property table (e.g. "JChemProperties").





Example for the property name:





Code:
table.SCOTT.MYTABLE.standardizerConfig
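For a read-only lookup of that entry, the property name can be assembled from the schema and table names. The helper below is hypothetical, and the key layout is an assumption inferred from the single example above (schema `SCOTT`, table `MYTABLE`); other setups may name the property differently.

```java
/** Hypothetical helper; the key layout is inferred from one example, not from JChem documentation. */
class StandardizerConfigKey {
    /** Builds "table.<schema>.<table>.standardizerConfig" without altering the case of its inputs. */
    static String forTable(String schema, String table) {
        return "table." + schema + "." + table + ".standardizerConfig";
    }
}
```

`StandardizerConfigKey.forTable("SCOTT", "MYTABLE")` reproduces the property name quoted above.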






Please note that modifying the property table with non-JChem tools is not recommended.





Best regards,





Szilard

User 86810cf9fa

13-01-2006 09:40:33

Thank you very much Szilard.





Best regards,


Severine