unusual match since 3.1.4

User dfeb81947d

23-12-2005 15:11:24

Dear Support,





Using JChem 3.0.14 to query a database without standardization, I was able to find two molecules (see SDFile) with chrysene as the query (see query.jpg), using JChemSearch with default options.


Now, with JChem 3.1.4, I standardize the CSMOL table for structure search as follows:


Code:
<?xml version="1.0" encoding="UTF-8"?>
<!-- Standardizer configuration file -->
<StandardizerConfiguration Version="0.1">
    <Actions MappingStyle="changing">
        <Reaction ID="nitro" Structure="[O-:2][N+:1]=O>>[O:2]=[N:1]=O"/>
        <Reaction ID="azide" Structure="N=[N:1]#[N:2]>>N=[N+:1]=[N-:2]"/>
        <Reaction ID="ammoniumhalide" Structure="C[N+:1][H:2].[F,Cl,Br,I;-:3]>>C[N:1]"/>
        <Reaction ID="enamine" Structure="[H:4][N:3][C:1]=[C:2]>>[H:4][C:2][C:1]=[N:3]"/>
        <Reaction ID="enol" Structure="[H:4][O:3][C:1]=[C:2]>>[H:4][C:2][C:1]=[O:3]"/>
        <Reaction ID="phosphate" Structure="[O-:2][P:1]>>[O:2][P:1]"/>
        <Reaction ID="phosphate" Structure="[O-:2][P:1]>>[O:2][P:1]"/>
        <Reaction ID="ammonium" Structure="[N+:1][H:2]>>[N:1]"/>
        <Reaction ID="demoinsise" Structure="[O-:2][C:1]>>[O:2][C:1]"/>
        <Aromatize ID="aromatize"/>
        <Dehydrogenize ID="dehydrogenize"/>
        <Sgroups ID="ungroup" Act="ungroup"/>
    </Actions>
</StandardizerConfiguration>






Now, with chrysene as the query, I no longer find the two structures.


Where do you think the problem is?





Is the standardization wrong?


Or should the query be more specific?





Thank you for your help.


Warmest regards, and have a nice weekend.


Merry Christmas.





Jacques

ChemAxon d76e6e95eb

23-12-2005 18:22:13

Just move the aromatize action to the top; the enol tautomerization will then no longer destroy the aromaticity of the rings.
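
As a rough sketch of that reordering (assuming everything else in the posted configuration stays unchanged), the Actions section would start with the aromatize action, so that the ring bonds are presumably already aromatic by the time the C=C pattern of the enol reaction is applied:

Code:
<Actions MappingStyle="changing">
    <!-- moved to the top so that the reactions below see aromatic rings -->
    <Aromatize ID="aromatize"/>
    <!-- the Reaction entries from the posted configuration follow here, unchanged -->
    <Dehydrogenize ID="dehydrogenize"/>
    <Sgroups ID="ungroup" Act="ungroup"/>
</Actions>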

ChemAxon 9c0afc9aaf

28-12-2005 13:04:49

Hi,





The





Code:
<Dehydrogenize ID="dehydrogenize"/>






tag should probably also be amended.





Explicit H atoms usually should not be removed from the query structure, because they can make a difference in the search.


For example, if a user draws a carbon with 3 explicit H atoms, he/she probably does not want to find a ring carbon of cyclohexane, where only 2 implicit H atoms are present.


However, it is useful to remove them from the target structures: this reduces the size of the graph, which increases search speed and reduces both the length of the cd_smiles and the memory footprint of the cache (without altering search results).





For this dual behavior, the "Optional" attribute should be used.


If it is set to "true", the action is performed only on target structures, not on queries.





Furthermore, I recommend the "ImplH" action instead.





So the recommended way of removing explicit H atoms is the following:





Code:
<ImplH ID="dehydrogenize" Optional="true"/>






Best regards,





Szilard

ChemAxon 9c0afc9aaf

28-12-2005 16:27:47

PS:
Quote:
<ImplH ID="dehydrogenize" Optional="true"/>
This is a workaround for a bug in 3.1.4 concerning the removal of wedged H atoms.


(ImplH only removes them if you specify it explicitly as an attribute.)





Szilard

User dfeb81947d

04-01-2006 09:47:07

Thank you very much, everybody.


Changing the dehydrogenation action and reordering the actions works perfectly.
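

For reference, a sketch of the configuration with both suggestions from this thread applied: the aromatize action moved to the top, and Dehydrogenize replaced by ImplH with Optional="true". The reaction entries are copied as-is from the original post; the file is assembled from the replies above and should be read as an illustration rather than a verified configuration.

Code:
<?xml version="1.0" encoding="UTF-8"?>
<!-- Standardizer configuration file -->
<StandardizerConfiguration Version="0.1">
    <Actions MappingStyle="changing">
        <!-- aromatize first, so the enol reaction no longer breaks aromatic rings -->
        <Aromatize ID="aromatize"/>
        <Reaction ID="nitro" Structure="[O-:2][N+:1]=O>>[O:2]=[N:1]=O"/>
        <Reaction ID="azide" Structure="N=[N:1]#[N:2]>>N=[N+:1]=[N-:2]"/>
        <Reaction ID="ammoniumhalide" Structure="C[N+:1][H:2].[F,Cl,Br,I;-:3]>>C[N:1]"/>
        <Reaction ID="enamine" Structure="[H:4][N:3][C:1]=[C:2]>>[H:4][C:2][C:1]=[N:3]"/>
        <Reaction ID="enol" Structure="[H:4][O:3][C:1]=[C:2]>>[H:4][C:2][C:1]=[O:3]"/>
        <Reaction ID="phosphate" Structure="[O-:2][P:1]>>[O:2][P:1]"/>
        <Reaction ID="phosphate" Structure="[O-:2][P:1]>>[O:2][P:1]"/>
        <Reaction ID="ammonium" Structure="[N+:1][H:2]>>[N:1]"/>
        <Reaction ID="demoinsise" Structure="[O-:2][C:1]>>[O:2][C:1]"/>
        <!-- explicit H atoms are removed from target structures only, not from queries -->
        <ImplH ID="dehydrogenize" Optional="true"/>
        <Sgroups ID="ungroup" Act="ungroup"/>
    </Actions>
</StandardizerConfiguration>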





Warmest Regards,


Jacques