Structure Checker in PL/SQL with exceptions for abbrevGroup

User 05d9866f9b

17-06-2013 16:31:21

I've learned that I can define standardization rules with exclusions:
<Sgroups ID="ungroup" Act="ungroup" Exclude="NH2,SH,COOH"/>


First question:
Somewhere there is an example like
<Sgroups ID="ungroup" Act="ungroup" Exclude="NH2,SH,COOH,MyGroup"/>
but I don't know how or where to define my own exclusions.


Second question:
Is there documentation, or are there examples, showing how to define this ungroup exclusion with the structure checker within PL/SQL?
I know that I can use, e.g., the following for structure checking/fixing:


(select id,
(jc_evaluate_x(structure,'chemTerms:check("abbrevGroup")')) as check_msg,
jc_evaluate_x(structure,'chemTerms:fix("abbrevGroup->expandgroup")') as fixed_struc,
structure as org_structure
from test_my_structs)

but I would like to define exceptions here as well and couldn't find anything in the documentation.


Is it possible to define this for structure checking/fixing too?


Thanks for your help

ChemAxon afdac7b783

18-06-2013 13:44:49

Dear Edith,


For the Standardizer actions Contract S-groups, Expand S-groups, and Ungroup S-groups, you can define an exclude list of specific S-groups, even custom ones.
These actions assume that the molecular structure contains an S-group having a name (or title). http://www.chemaxon.com/marvin/help/sketch/sketch-basic.html#abbreviatedgroups


If you create an S-group and name it, for example "Poppy", you should use this name in the exclude list (see attachments).


standardize -c "sgroups:ungroup:exclude='Poppy'" contracted-poppy.mrv  -f mrv

 
In JChem Cartridge, Structure Checker configurations work via action strings, i.e., for "expand all abbreviated groups except Poppy", the structure checker action string will be the following:


"abbrevgroup:excluded=Poppy->expandgroup"


Using this configuration in check mode, the checker will not report the "Poppy" group. Moreover, in fix mode, the fixer will not modify the "Poppy" group, since the fixer only changes what the checker has found.


So within PL/SQL, simply insert "abbrevgroup:excluded=Poppy->expandgroup" into the jc_evaluate_x expression.
http://www.chemaxon.com/marvin/help/structurechecker/structurechecker_examples.html#jcc 
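
The nesting of the action string inside the chemTerms expression, and of that expression inside a SQL string literal, can be sketched as follows. This is only a string-building illustration in Python: the helper names (sql_literal, checker_action, chemterms_expr) are hypothetical and not part of any ChemAxon API; only jc_evaluate_x, the action string format, and the test_my_structs table come from this thread.

```python
def sql_literal(text):
    """Wrap text in a SQL string literal, doubling embedded single quotes."""
    return "'" + text.replace("'", "''") + "'"


def checker_action(excluded, fixer=None):
    """Build an action string such as abbrevgroup:excluded=Poppy->expandgroup."""
    action = "abbrevgroup:excluded=" + ",".join(excluded)
    if fixer:
        action += "->" + fixer
    return action


def chemterms_expr(mode, action):
    """Wrap the action string in chemTerms:check("...") or chemTerms:fix("...")."""
    return f'chemTerms:{mode}("{action}")'


# Hypothetical usage: expand every abbreviated group except "Poppy".
expr = chemterms_expr("fix", checker_action(["Poppy"], "expandgroup"))
query = (
    "select jc_evaluate_x(structure, "
    + sql_literal(expr)
    + ") as fixed_struc from test_my_structs"
)
print(query)
```

The point of sql_literal is that any single quote inside the expression has to be doubled once it sits inside the SQL literal.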


Best Regards,


Viktoria

User 05d9866f9b

18-06-2013 21:14:29

Dear Viktoria - thanks for the response, it helped.
I can now work with, e.g.,


select  jc_evaluate_x(structure,'chemTerms:fix("abbrevGroup:exclude=''NH2,SH,COOH,Ala,Arg,Asn,Asp,Cys,Glu,Gln,Gly,His,Ile,Leu,Lys,Met,Phe,Pro,Ser,Thr,Trp,Tyr,Val''->expandgroup")') as fixed_struc from .....


But when I use the example with the S-group ("abbrevGroup:exclude=Poppy->expandgroup" ....) in my PL/SQL statement, where is this user-defined S-group stored in the database?
It has to be stored somewhere in the database when I want to use the check, e.g., in a trigger, right?

What also irritates me a bit is that I can use nonexistent groups within the statement and I don't get any error message (see the example below - there is no S-group with the name "GOODMORNING"). I see "passed" as the result, but I would expect an error message / exception.

select jc_evaluate_x(structure,'chemTerms:check("abbrevGroup:exclude=''GOODMORNING''")') as check_msg from ....



Perhaps I have a basic misunderstanding and a phone call would be easier. Thanks.

ChemAxon afdac7b783

19-06-2013 11:50:11

I'm not sure that I clearly understand your concern regarding the nonexistent groups in the exclude list, but I have two assumptions:


1) Would you like to get an error message if the list of excluded groups contains a nonexistent group name?


The aim of the abbreviated group checker is to find any named S-groups in the molecular structure. The exclude list is an option: the checker will not consider the listed groups during checking.


Since custom abbreviations can also be listed in this "excluded" option, the checker cannot throw any error or exception on a nonexistent group name, because a user can create a group with any name. (How should it know that you do not have a custom "GOODMORNING" group among your custom groups?)


2) Would you like to get an error message if the checker does not find a listed group name during checking?


Take an example: You have a collection of functionalized tripeptides, and you want to check only the end groups, not the amino acids.


You would configure the abbreviated group checker so that the excluded list contains, e.g., 21 amino acids.


If the checker raised an error for every listed name it did not find, it would throw errors for all molecules during checking, since each tripeptide lacks at least 18 of the 21 listed amino acids.
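
The two cases above can be illustrated with a toy model in Python. This is a sketch under stated assumptions, not the actual checker: flagged_groups and unused_exclusions are hypothetical helper names, the "Boc" and "OMe" end groups are made up for the example, and the amino acid list is the 20-name list used elsewhere in this thread (so the count comes out as 17 of 20 rather than 18 of 21).

```python
# Toy model only: illustrative names and logic, not the ChemAxon implementation.
AMINO_ACIDS = (
    "Ala,Arg,Asn,Asp,Cys,Glu,Gln,Gly,His,Ile,"
    "Leu,Lys,Met,Phe,Pro,Ser,Thr,Trp,Tyr,Val"
).split(",")


def flagged_groups(groups_in_molecule, excluded):
    """Return the abbreviated groups a checker would report: all except excluded ones."""
    skip = set(excluded)
    return [name for name in groups_in_molecule if name not in skip]


def unused_exclusions(groups_in_molecule, excluded):
    """Return excluded names that do not occur in this particular molecule."""
    present = set(groups_in_molecule)
    return [name for name in excluded if name not in present]


# Case 1: an excluded name that matches nothing is simply never used.
print(flagged_groups(["COOH"], ["COOH", "GOODMORNING"]))

# Case 2: a hypothetical functionalized tripeptide with made-up end groups.
tripeptide = ["Boc", "Ala", "Gly", "Ser", "OMe"]
print(flagged_groups(tripeptide, AMINO_ACIDS))
print(len(unused_exclusions(tripeptide, AMINO_ACIDS)))
```

In this model an unknown name such as "GOODMORNING" has no effect at all, and treating unused exclusions as errors would flag every tripeptide, because most listed amino acids are absent from any single molecule.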


My colleague will answer the database related question soon.


We will contact you to discuss these and any other problems you have.


Best regards, 


Viktoria

ChemAxon aa7c50abf8

19-06-2013 12:38:20

Dear Edith,


when I use the example with the S-group ("abbrevGroup:exclude=Poppy->expandgroup" ....) in my PL/SQL statement, where is this user-defined S-group stored in the database?
It has to be stored somewhere in the database when I want to use the check, e.g., in a trigger, right?

The sgroups are currently not stored separately as part of the index data.


select  jc_evaluate_x(structure,'chemTerms:fix("abbrevGroup:exclude=''NH2,SH,COOH,Ala,Arg,Asn,Asp,Cys,Glu,Gln,Gly,His,Ile,Leu,Lys,Met,Phe,Pro,Ser,Thr,Trp,Tyr,Val''->expandgroup")') as fixed_struc from ....


JC_EVALUATE_X operates on the standardized form of the structure stored as part of the index data. As the standardized form is stored in ChemAxon Extended SMILES format which doesn't support all sgroup types, you currently have to use the JCF.EVALUATE_X function for all types of groups to be included:


declare
cursor c1 is select structure from test_my_structs;
f clob;
begin
for t in c1
loop
f := jcf.evaluate_x(t.structure,'chemTerms:fix("abbrevgroup:excluded=Poppy->expandgroup") outFormat:mrv');
-- Use the fixed structure to your liking, e.g. print it out: dbms_output.put_line(f);
end loop;
end;

Peter

User 05d9866f9b

05-07-2013 14:27:17

Hi Peter - thanks for the hint - but it still doesn't work:


You can see the exclusions below:("abbrevGroup:exclude=''NH2,SH,COOH,Ala,Arg,Asn,Asp,Cys,Glu,Gln,Gly,His,Ile,Leu,Lys,Met,Phe,Pro,Ser,Thr,Trp,Tyr,Val''")


We've used the jcf.evaluate_x function, but the COOH is still expanded in my example (see attached hardcopy).
The structure is standardized with aromatize/basic, removeExplicitH and tautomerize.

What is wrong with the exclusion statement? Why is this group still expanded?


Thanks - Edith

ChemAxon aa7c50abf8

05-07-2013 14:32:24

Hi Edith,


Please, could you post


1. the full SQL,


2. the input structure and


3. the expected output?


Thanks,


Peter

User 05d9866f9b

10-07-2013 09:48:39

Hi Peter, Please see structure and expected behavior in the attached file. Thanks for your support.

ChemAxon aa7c50abf8

10-07-2013 10:56:17

Hi Edith,


The following SQL works for me (JChem 5.12.4):


select jcf.evaluate_x('
Mrv0541 07101312422D

11 11 0 0 0 0 999 V2000
1.6500 -0.4714 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0
0.9355 -0.8839 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0
0.9355 -1.7089 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0
1.6500 -2.1214 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0
2.3645 -1.7089 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0
2.3645 -0.8839 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0
-0.2742 0.7385 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0
-0.3119 -0.0857 0.0000 C 0 0 2 0 0 0 0 0 0 0 0 0
0.3830 -0.5304 0.0000 N 0 0 0 0 0 0 0 0 0 0 0 0
-1.0444 -0.4651 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0
-1.0821 -1.2893 0.0000 O 0 0 0 0 0 0 0 0 0 0 0 0
1 2 1 0 0 0 0
1 6 1 0 0 0 0
2 3 1 0 0 0 0
3 4 1 0 0 0 0
4 5 1 0 0 0 0
5 6 1 0 0 0 0
9 2 1 0 0 0 0
7 8 1 0 0 0 0
8 9 1 6 0 0 0
8 10 1 0 0 0 0
10 11 2 0 0 0 0
M STY 1 1 SUP
M SAL 1 5 7 8 9 10 11
M SBL 1 1 7
M SMT 1 Ala
M SAP 1 1 9
M SAP 1 1 10
M SCL 1 CXN
M END', 'chemTerms:check("abbrevGroup:exclude=''Ala,Arg,Asn,Asp,Cys,Glu,Gln,Gly,His,Ile,Leu,Lys,Met,Phe,Pro,Ser,Thr,Trp,Tyr,Val''")') from dual;


It gives: 


failed, Abbreviated Group Checker: 1 abbreviated group found


Peter

User 05d9866f9b

10-07-2013 13:00:45

Hi Peter - perhaps we have a misunderstanding. We defined Ala as an exclusion, so we expect this structure to pass the structure checker.

ChemAxon aa7c50abf8

11-07-2013 11:31:52

Hi Edith,


The exclusion option should have a "d" at the end:


abbrevGroup:excluded


Peter

User 05d9866f9b

11-07-2013 17:02:21

Hi Peter - in my environment there is no difference in behavior between exclude and excluded.


But I noticed another difference. When I check the standardized structure (the SMILES string from the DIX table), the check always passes.


It even passes if I don't define Ala in the exclusion list. Please see the details in the attached document.


If you don't see the same behavior, it might be best to set up a short web meeting.

ChemAxon afdac7b783

12-07-2013 11:48:55

Hi Edith,


If the input molecule is in SMILES, the Abbreviated Group Checker will always return "passed" as result: SMILES format does not support abbreviated groups.


Test: Copy your SMILES string (CC(NC1=CC=CC=C1)C=O |c:5,7,t:3|) into MarvinSketch. You can see that the Ala group is not present in the structure as an abbreviated group (it is ungrouped).


Peter is investigating the problem related to the molecule structure input and the usage of the "exclude" or "excluded" list. However, the expected behavior is:


When "exclude" is used, no exclusion should be applied; only "excluded" is accepted in Structure Checker to define exclusions.


Correct "excluded" list: 


$ evaluate -e "check('abbrevGroup:excluded=Arg,Asn,Asp,Cys,Glu,Gln,Gly,His,Ile,Leu,Lys,Met,Phe,Pro,Ser,Thr,Trp,Tyr,Val')" input_with_Ala.mrv
failed, Abbreviated Group Checker: 1 abbreviated group found

$ evaluate -e "check('abbrevGroup:excluded=Ala,Arg,Asn,Asp,Cys,Glu,Gln,Gly,His,Ile,Leu,Lys,Met,Phe,Pro,Ser,Thr,Trp,Tyr,Val')" input_with_Ala.mrv
passed

Incorrect "exclude" list:


$ evaluate -e "check('abbrevGroup:exclude=Ala,Arg,Asn,Asp,Cys,Glu,Gln,Gly,His,Ile,Leu,Lys,Met,Phe,Pro,Ser,Thr,Trp,Tyr,Val')" input_with_Ala.mrv
failed, Abbreviated Group Checker: 1 abbreviated group found

$ evaluate -e "check('abbrevGroup:exclude=Arg,Asn,Asp,Cys,Glu,Gln,Gly,His,Ile,Leu,Lys,Met,Phe,Pro,Ser,Thr,Trp,Tyr,Val')" input_with_Ala.mrv
failed, Abbreviated Group Checker: 1 abbreviated group found

Best regards,


Viktoria

ChemAxon aa7c50abf8

12-07-2013 16:07:06

Hi Edith,


Indeed, the check fails with both "exclude" and "excluded" in JChem versions 5.12 and 6.0. It returns the expected "passed" in version 6.1, which is currently in pre-alpha stage.


Peter

User 05d9866f9b

13-07-2013 21:32:10

Hi Peter - is there an overview of what is fixed in 6.1? Thanks, Edith

ChemAxon aa7c50abf8

15-07-2013 10:16:10

Hi Edith,


Such a list cuts across multiple ChemAxon projects/products. I notified the Release Manager of your request, who will reply to you soon.


Peter

ChemAxon 6a002a76a4

15-07-2013 11:25:40

Dear Edith,


With every release we create a changes.html where we post the fixes for that release, but I'm sure you already know about this. At this time we are in the middle of the testing phase of release 6.1 and do not have this changes page yet. But if you want, we can look into your problem with the structure checker, find out what has changed from 6.0 to 6.1, and inform you as soon as possible.


Regards,


Roland

ChemAxon afdac7b783

22-07-2013 15:11:52

Hi Edith, 


The relevant change from version 6.0 to 6.1 is related to the handling of quotation marks around Structure Checker's excluded list.


Version 6.0


In Structure Checker, the abbreviated group checker accepts the excluded list of groups, but these groups should not be placed between single quotes.


When an excluded list starts and ends with single quotes, e.g., "abbrevgroup:excluded='Ala,Thr,Gln'", the first and last elements of the list are not taken into account, because the checker looks up the names 'Ala and Gln' (quote characters included) in the abbreviated group list. Only Thr will be excluded in the above case.
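
The effect described above is what a plain comma split of the quoted value produces. The Python sketch below models it and also shows that stripping one pair of surrounding quotes before splitting avoids the problem. Both parsing functions are hypothetical illustrations; how the actual Structure Checker parser is implemented is not shown in this thread.

```python
KNOWN_GROUPS = {"Ala", "Thr", "Gln"}


def parse_excluded_naive(value):
    """Split on commas only, so quote characters stay attached to the names."""
    return value.split(",")


def parse_excluded_unquoted(value):
    """Strip one pair of matching surrounding quotes, then split on commas."""
    if len(value) >= 2 and value[0] == value[-1] and value[0] in "'\"":
        value = value[1:-1]
    return value.split(",")


quoted = "'Ala,Thr,Gln'"

# Names that still match a known group after each kind of parsing.
naive_matches = [n for n in parse_excluded_naive(quoted) if n in KNOWN_GROUPS]
clean_matches = [n for n in parse_excluded_unquoted(quoted) if n in KNOWN_GROUPS]

print(parse_excluded_naive(quoted))
print(naive_matches)
print(clean_matches)
```

With the naive split, the first and last names keep their quote characters and match no group name, so only Thr is effectively excluded.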


Version 6.1


You can put single quotes around the excluded list. The following examples all work in 6.1, and each excludes the Ala, Thr, and Gln groups:


"abbrevgroup:excluded=Ala,Thr,Gln->contractgroup"


"abbrevgroup:excluded='Ala,Thr,Gln'->contractgroup"


"abbrevgroup:excluded="Ala,Thr,Gln"->contractgroup"


Best regards, 


Viktoria

User 05d9866f9b

26-07-2013 13:25:49

Thanks for the information - that is helpful.


BTW - the mail notification still doesn't work, so it was rather by accident that I saw your message. I've raised this several times already - it would be great to get it fixed :)

ChemAxon aa7c50abf8

13-09-2013 17:15:47

Hi Edith,


JChem version 6.1 has been released including a fix for the issue of incorrect checker results mentioned in my post above.


Best regards,


Peter