Stereochemistry with JChem Web Services

User 779e37e0e6

20-01-2015 20:04:36

Hi,


I have configured a jchem database with query structures. I have set the "table.C_PATTERNS.absoluteStereo" flag to false. This should take the stereochemistry into account when defined. I am running superstructure search on it, but despite trying many options, I get results that are not correct.


My search options are these:


:'searchOptions' => { :'queryStructure' => smiles,
                      :'searchType' => "SUPERSTRUCTURE",
                      :stereoSearchType => "ON", :absoluteStereo => "CHIRAL_FLAG" },


 


I have tried also without specifying the ":absoluteStereo" options, since it was already set to false in the database. I also tried with :absoluteStereo => "TABLE_OPTION", and did not get any difference in the results.


When I search for patterns contained in the target CCCSC[C@H](N)C(O)=O, which is an S-alkyl-L-cysteine, I also get the attached structure back, which is for a D-alpha amino acid. The SMARTS pattern I use for the L-alpha amino acid is the following, and it is correctly returned as a hit.


-------------------------------------

  Mrv0541 01201512502D

  6  5  0  0  0  0            999 V2000
    0.1956   -0.2513    0.0000 N   0  0  0  0  0  0  0  0  0  0  0  0
    0.6078    0.4631    0.0000 C   0  0  2  0  0  0  0  0  0  0  0  0
    1.4328    0.4632    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0
    1.8453    1.1775    0.0000 O   0  0  0  0  0  0  0  0  0  0  0  0
    1.8453   -0.2512    0.0000 O   0  0  0  0  0  0  0  0  0  0  0  0
    0.0245    1.0465    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0
  2  1  1  1  0  0  0
  2  3  1  0  0  0  0
  3  4  2  0  0  0  0
  3  5  1  0  0  0  0
  2  6  1  0  0  0  0
A    6
AH
M  MRV SMA   1 [#7;A;X3,X4+;!$([N]~[!#6]);!$([N]*~[#7,#8,#15,#16])]
M  MRV SMA   2 [C;X4]
M  MRV SMA   3 [C;X3]
M  MRV SMA   4 [O;X1]
M  MRV SMA   5 [#8;A;X2H1,X1-]
M  END
-------------------------------------



Curiously, the structure CNCC(=O)O, which does not have any stereo, returns both l-alpha amino acid and d-alpha amino acid. This suggests that the stereochemistry info of the patterns in the jchem database is not taken into account during the search operation.


Could you please help me out here?


Thanks


Regards,


Yannick

ChemAxon 13811e1703

23-01-2015 14:29:13

Hi Yannick,

Stereochemistry has its own group in the searchOptions like this:


"searchOptions": {
    "queryStructure": "CCCCCC",
    "searchType": "SUPERSTRUCTURE",
    "stereoChemistry": {
        "stereoSearchType": "ON",
        "absoluteStereo": "CHIRAL_FLAG"
    }
}
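
For a Ruby client like the one in the original post, the same nesting could be sketched roughly as follows. The helper name `build_search_options` is hypothetical; only the keys and values shown in the JSON above come from this thread.

```ruby
require 'json'

# Hypothetical helper (not part of any ChemAxon client): builds a request
# body with the stereo settings nested under "stereoChemistry" instead of
# sitting directly under "searchOptions".
def build_search_options(smiles, stereo_search_type: "ON", absolute_stereo: "CHIRAL_FLAG")
  {
    searchOptions: {
      queryStructure: smiles,
      searchType: "SUPERSTRUCTURE",
      stereoChemistry: {
        stereoSearchType: stereo_search_type,
        absoluteStereo: absolute_stereo
      }
    }
  }
end

# Print the body for the target structure from the first post.
puts JSON.pretty_generate(build_search_options("CCCSC[C@H](N)C(O)=O"))
```

The point of the sketch is only the shape of the hash: `stereoSearchType` and `absoluteStereo` placed directly under `searchOptions`, as in the first post, do not match the grouping shown above.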



Regards,
Peter 

User 779e37e0e6

25-01-2015 19:27:04

Hi Peter,


Thanks for replying. I made the changes, but it did not change the results. I still get both L- and D- amino acids, which is incorrect.


Regards,


Yannick

ChemAxon abe887c64e

26-01-2015 13:30:44

Hi Yannick,


Setting the table option "absoluteStereo" to true (the default) means that all structures with tetrahedral stereochemistry are considered absolute (even if they have no chiral flag).


Setting the table option "absoluteStereo" to false means that molecules stored in that table without the chiral flag are considered racemic, while molecules with the chiral flag are considered absolute.


The interpretation of the above table option can be overridden in a search with the setAbsoluteStereo search option.


Additionally, there are further search options that affect the results of stereochemical searches, e.g. setStereoSearchType. This page contains a description and links to the JChem API docs relating to stereochemical features.


So, we recommend setting the "absoluteStereo" table option to true in order to distinguish, e.g., L- and D-amino acids (or setting the table "absoluteStereo" to false and applying absolute_stereo_always_on as the AbsoluteStereo search option).


If you want to distinguish an L-amino acid from an amino acid with undefined configuration (even if the undefined one is in the query role), you have to apply Stereo_Exact as stereoSearchType.
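
The rules above can be summarized as a small decision function. This is a toy model in Ruby, not JChem code: the function name is made up, and mapping the option strings seen in this thread ("TABLE_OPTION", "CHIRAL_FLAG", "ALWAYS_ON") onto these branches is an assumption.

```ruby
# Toy model of how a stored molecule's stereo centers are interpreted.
# table_absolute_stereo: the table's "absoluteStereo" option (true/false).
# chiral_flag:           whether the stored molecule carries the chiral flag.
# search_override:       ASSUMED meaning of the per-search option strings
#                        that appear in this thread.
def stereo_interpretation(table_absolute_stereo:, chiral_flag:, search_override: "TABLE_OPTION")
  absolute =
    case search_override
    when "ALWAYS_ON"    then true                                  # everything absolute
    when "CHIRAL_FLAG"  then chiral_flag                           # only flagged molecules
    when "TABLE_OPTION" then table_absolute_stereo || chiral_flag  # defer to the table
    else raise ArgumentError, "unknown override: #{search_override}"
    end
  absolute ? :absolute : :racemic
end

# Table set to false, molecule imported without a chiral flag:
puts stereo_interpretation(table_absolute_stereo: false, chiral_flag: false)
# Same molecule, but the search forces absolute stereo:
puts stereo_interpretation(table_absolute_stereo: false, chiral_flag: false,
                           search_override: "ALWAYS_ON")
```

Under this model, a D- and an L-amino acid stored without chiral flags in a table with "absoluteStereo" false are both treated as racemic, which is why they cannot be told apart unless the search overrides the table option.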


Best regards,


Krisztina

User 779e37e0e6

26-01-2015 18:45:06

Hello Krisztina,


I have tried the changes in two ways:


1) I set absoluteStereo to false in the table, and the following options in my code:


options = {
  :'searchOptions' => { :'queryStructure' => smiles,
                        :'searchType' => "SUPERSTRUCTURE",
                        :stereoChemistry => { :stereoSearchType => "EXACT", :absoluteStereo => "ALWAYS_ON" } },
  :display => { :include => [ 'cd_id', 'name' ] },
  :paging => { :offset => 0, :limit => 100000 }
}


 


2) I set absoluteStereo to true in the table, and the following options in my code:


options = {
  :'searchOptions' => { :'queryStructure' => smiles,
                        :'searchType' => "SUPERSTRUCTURE",
                        :stereoChemistry => { :stereoSearchType => "EXACT" } },
  :display => { :include => [ 'cd_id', 'name' ] },
  :paging => { :offset => 0, :limit => 100000 }
}


 


I have not re-imported the structures into the database. I just changed the value of the options.


 


Unfortunately, there is no change in the results. I still get both l-amino acid, and d-amino acid, as well as amino-acid as results. It does not matter if the structure I submit for superstructure search contains stereochemistry information or not. If it does not contain stereochemistry, it should not return any hit with stereochemistry information (e.g.: D-/ L- amino acid, etc...). If it does contain stereochemistry information, it should only return the correct hit (either D-amino acid or L-amino acid).


Somehow, it does not work. I am wondering if it has something to do with the version I am using (14.10.20.0).


This is very urgent for me, and your continued help would be greatly appreciated.


Regards,


 


Yannick

ChemAxon abe887c64e

26-01-2015 20:48:11

Hi Yannick,


I tested with JChem Base v14.10.20 and found the expected search results in all cases if I applied the queries in molfile form. See the attached doc.


If the queries were applied in smiles form, there were missing hits in two cases: when table had  "absoluteStereo false"  settings and the query was L- or D-alanine and "stereoSearchType exact" was applied. We will further examine these cases.


Would you check the absoluteStereo table setting in your JCHEMPROPERTIES table? Was the change from false to true actually applied?


Best regards,


Krisztina

User 779e37e0e6

26-01-2015 22:53:16

Hi Krisztina,


I have changed the absolute stereo option to true in my database table. The latest version I used was 14.12.8.0. Sorry for the confusion. The queries are applied in SMILES format.


Regards,


Yannick

User 779e37e0e6

26-01-2015 23:30:15

 



Hi Krisztina,


Thanks for replying. I have changed the absolute stereo option to true in my database table. The latest version I used was 14.12.8.0. Sorry for the confusion. The queries are applied in SMILES format.


From the document you sent, I think the absolute stereo should be set to true and the stereoSearchType should be the default (is this the same as stereoSearchType => "ON" ?).


Regards,


Yannick


ChemAxon abe887c64e

27-01-2015 13:14:55

Hi Yannick,


The results I wrote yesterday were produced with a query table filled with simple amino acid structures (without query atoms and any other query properties). Now I tested the stereosearch behavior with D-alpha-amino-acid.mol you sent and I could reproduce the false hits.


The source of the problem is that the structure D-alpha-amino-acid.mol contains an AH atom next to the stereocenter. Our core code doesn't calculate the configuration (R or S) of stereocenters to which such query atoms are connected. Therefore, we cannot perform the comparison you need during the search.
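

To find such patterns before importing them, a rough Ruby check along these lines might help. It is only a sketch: the function name is made up, it reads V2000 molfiles by splitting on whitespace rather than by fixed-width columns, and it treats an atom as a stereocenter if its parity field is non-zero or if a wedge bond starts at it.

```ruby
# Reports aliased atoms (such as AH) bonded to a marked stereocenter.
# Simplified V2000 reader: whitespace splitting only suits small molfiles.
def aliases_next_to_stereocenters(molfile)
  lines = molfile.lines.map(&:chomp).reject { |l| l.strip.empty? }
  counts_at = lines.index { |l| l.include?("V2000") }
  raise ArgumentError, "no V2000 counts line found" unless counts_at

  atom_count, bond_count = lines[counts_at].split.first(2).map(&:to_i)
  atom_lines = lines[counts_at + 1, atom_count]
  bonds = lines[counts_at + 1 + atom_count, bond_count].map { |l| l.split.map(&:to_i) }
  tail  = lines.drop(counts_at + 1 + atom_count + bond_count)

  # Stereocenters: non-zero parity (7th field) or first atom of a wedge bond.
  centers = []
  atom_lines.each_with_index { |l, i| centers << i + 1 if l.split[6].to_i != 0 }
  bonds.each { |from, _to, _order, stereo| centers << from if [1, 4, 6].include?(stereo) }

  # An "A  <n>" line is followed by the alias text for atom n.
  aliases = {}
  tail.each_cons(2) do |tag, text|
    m = tag.match(/\AA\s+(\d+)\s*\z/)
    aliases[m[1].to_i] = text.strip if m
  end

  bonds.flat_map { |a, b, *| [[a, b], [b, a]] }
       .select { |center, nbr| centers.include?(center) && aliases.key?(nbr) }
       .map { |center, nbr| { stereocenter: center, atom: nbr, alias: aliases[nbr] } }
end

# Pattern from the first post (SMARTS property lines omitted for brevity).
SAMPLE_MOLFILE = <<'MOL'

  Mrv0541 01201512502D

  6  5  0  0  0  0            999 V2000
    0.1956   -0.2513    0.0000 N   0  0  0  0  0  0  0  0  0  0  0  0
    0.6078    0.4631    0.0000 C   0  0  2  0  0  0  0  0  0  0  0  0
    1.4328    0.4632    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0
    1.8453    1.1775    0.0000 O   0  0  0  0  0  0  0  0  0  0  0  0
    1.8453   -0.2512    0.0000 O   0  0  0  0  0  0  0  0  0  0  0  0
    0.0245    1.0465    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0
  2  1  1  1  0  0  0
  2  3  1  0  0  0  0
  3  4  2  0  0  0  0
  3  5  1  0  0  0  0
  2  6  1  0  0  0  0
A    6
AH
M  END
MOL

p aliases_next_to_stereocenters(SAMPLE_MOLFILE)
```

For the pattern posted at the start of this thread, the check flags atom 6 (alias AH) as a neighbor of the stereocenter at atom 2, which is the situation described above.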


Yes, StereoSearchType: ON is the default.


Best regards,


Krisztina

User 779e37e0e6

28-01-2015 06:58:19

Hi Krisztina,

Thanks for replying. I removed the AH group from the L- and D-amino acid structures. The problem I have now is that when I import one of the resulting molecules into my structure table, the other is not imported. It looks like JChem cannot differentiate between the two when there is no group attached to the alpha C-atom.

Regards,

Yannick

ChemAxon 13811e1703

28-01-2015 08:59:16

Hi Yannick,

There is an option for JChem tables called duplicateFiltering; in JChem Web Services import it is false by default. Can you check this option in the JCHEMPROPERTIES table of your database?

Regards,
Peter 

User 779e37e0e6

28-01-2015 18:19:42

Hi Peter,


Thank you for replying. The duplicate filtering has been set to true. We have always had it set to true, even before we had the problem I described in my previous e-mails of this topic.


Regards,


Yannick

ChemAxon abe887c64e

02-02-2015 15:19:30

Hi Yannick,


Would substituting AH with a carbon atom be a convenient solution for you?


Best regards,


Krisztina

User 779e37e0e6

22-04-2015 04:34:15

Hi Krisztina,


I made some changes that helped. Thanks for your help.


Yannick

User 779e37e0e6

10-11-2015 22:38:12

Dear all,


 


Regarding this same topic, I have set the absoluteStereo option to true for superstructure search. I also created two SMARTS patterns for L- and D-alpha-amino acids.


D-alpha amino acid: [#6]-[#6@@H;X4]([#7;A;X3,X4+;!$([N]~[!#6]);!$([N]*~[#7,#8,#15,#16])][*,#1])-[#6;X3]([#8;A;X2H1,X1-])=[O;X1]


L-alpha amino acid: [#6]-[#6@H;X4]([#7;A;X3,X4+;!$([N]~[!#6]);!$([N]*~[#7,#8,#15,#16])][*,#1])-[#6;X3]([#8;A;X2H1,X1-])=[O;X1]


For the following compounds, I get the correct results: C[C@@H](N)C([O-])=O (D-alpha amino acid) and C[C@H](N)C([O-])=O (L-alpha amino acid).


However, for some compounds, I get both the D- and L-alpha amino acid patterns, e.g. NC(C(O)=O)C(O)=O and D-isovaline (CC[C@@](C)(N)C(O)=O).


Surprisingly, for the following compound, I do not get either the D- or the L-alpha-amino acid pattern, but just alpha-amino acid: NC(CC(N)C(O)=O)C(O)=O

ChemAxon abe887c64e

11-11-2015 09:34:32

Dear MrYan,


I will try to explain why you received the unexpected results.


However, for some compounds, I get both the D- and L-alpha amino acid patterns, e.g. NC(C(O)=O)C(O)=O and D-isovaline (CC[C@@](C)(N)C(O)=O).


The molecule NC(C(O)=O)C(O)=O is a hit because its central carbon atom, which corresponds to the chiral (R) carbon atom of the D-alpha amino acid pattern, is not an asymmetric carbon atom. It has lost its asymmetry because the substituents on the methyl group turn it into a second carboxyl group, identical to the first. For structures without tetrahedral chirality, JChem search does not check the configuration.
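

The reasoning can be illustrated with a toy Ruby check. This is not a stereo perception algorithm and not how JChem works internally: the substituent lists are written by hand, and two substituents count as identical only when their strings match.

```ruby
# Toy check: a tetrahedral carbon is treated as asymmetric only if its four
# substituents (hand-written fragment strings) are all different.
def asymmetric_center?(substituents)
  raise ArgumentError, "expected four substituents" unless substituents.size == 4
  substituents.uniq.size == 4
end

# Alpha carbon of alanine, C[C@H](N)C(O)=O
alanine = ["N", "C", "C(O)=O", "[H]"]
# Central carbon of NC(C(O)=O)C(O)=O: the side chain is a second carboxyl group
aminomalonic_acid = ["N", "C(O)=O", "C(O)=O", "[H]"]

puts asymmetric_center?(alanine)            # true
puts asymmetric_center?(aminomalonic_acid)  # false
```

Because the second case has two identical substituents, there is no R or S configuration to compare, so both the D- and the L-pattern can match it.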





Surprisingly, for the following compound, I do not get either the D- or the L-alpha-amino acid pattern, but just alpha-amino acid: NC(CC(N)C(O)=O)C(O)=O



Your results are surprising to us as well. The D-alpha amino acid pattern should give the following hits (because they contain at least one (R) center):


N[C@H](C[C@@H](N)C(O)=O)C(O)=O


NC(C[C@@H](N)C(O)=O)C(O)=O


N[C@@H](C[C@@H](N)C(O)=O)C(O)=O


Should not hit:


NC(CC(N)C(O)=O)C(O)=O


NC(C[C@H](N)C(O)=O)C(O)=O


N[C@@H](C[C@H](N)C(O)=O)C(O)=O
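

To compare these expectations with what a search actually returns, a small Ruby helper like the following could be used. It is a sketch only: the helper name is made up, and it assumes the returned structures are available as SMILES strings written exactly as in the two lists above (no canonicalization is attempted).

```ruby
# Expected behavior of the D-alpha amino acid pattern, as listed above.
EXPECTED_HITS = [
  "N[C@H](C[C@@H](N)C(O)=O)C(O)=O",
  "NC(C[C@@H](N)C(O)=O)C(O)=O",
  "N[C@@H](C[C@@H](N)C(O)=O)C(O)=O"
].freeze

EXPECTED_MISSES = [
  "NC(CC(N)C(O)=O)C(O)=O",
  "NC(C[C@H](N)C(O)=O)C(O)=O",
  "N[C@@H](C[C@H](N)C(O)=O)C(O)=O"
].freeze

# Hypothetical helper: reports expected hits that were not returned and
# returned structures that should not have matched.
def diff_hits(actual, expected_hits: EXPECTED_HITS, expected_misses: EXPECTED_MISSES)
  {
    missing: expected_hits - actual,
    unexpected: actual & expected_misses
  }
end

# Example: the search returned only the first expected hit plus one false hit.
p diff_hits(["N[C@H](C[C@@H](N)C(O)=O)C(O)=O", "NC(CC(N)C(O)=O)C(O)=O"])
```

An empty `missing` and an empty `unexpected` list would mean the search agrees with the expectations above for these six test structures.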


Have you applied any other stereo-search-specific option, or only absoluteStereo true?


Best regards,


Krisztina

User 779e37e0e6

11-11-2015 17:59:38

Hi Krisztina,


I set the following option: absoluteStereo True.


Thank you.


MrYan

User 779e37e0e6

11-11-2015 18:05:43

Hi Krisztina,


Based on the results in the stereosearc.docx document attached to your post from Mon Jan 26, 2015 9:48 pm in this thread, I would expect that with my settings the molecule NC(C(O)=O)C(O)=O should not be a hit for either the D- or the L-alpha amino acid pattern.


Moreover, can one really classify a compound as both an L- and a D-alpha amino acid in this very case (based on the same amino acid group)?


Best,


MrYan

ChemAxon abe887c64e

12-11-2015 13:13:31

Hi MrYan,


Would you summarize all the conditions, please, because I could not reproduce the same results as you:



We provide an additional search option that influences stereo search results: the stereoModel. Its default value is Comprehensive. You can check how it works with the Local or Global settings (NC(C(O)=O)C(O)=O won't be returned).
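

One way to try the three settings side by side from a Ruby client is sketched below. Please treat the details as assumptions: the key name `stereoModel`, its placement inside the `stereoChemistry` group, and the upper-case value spellings are guesses modelled on the other options in this thread and should be checked against the Web Services documentation.

```ruby
require 'json'

# ASSUMPTION: key name, placement under "stereoChemistry", and upper-case
# value spellings are not confirmed anywhere in this thread.
STEREO_MODELS = %w[COMPREHENSIVE LOCAL GLOBAL].freeze

# Hypothetical helper: one request body per stereo model, otherwise identical,
# so the hit lists can be compared setting by setting.
def options_for_each_stereo_model(smiles)
  STEREO_MODELS.map do |model|
    {
      searchOptions: {
        queryStructure: smiles,
        searchType: "SUPERSTRUCTURE",
        stereoChemistry: { stereoSearchType: "ON", stereoModel: model }
      }
    }
  end
end

options_for_each_stereo_model("NC(C(O)=O)C(O)=O").each do |options|
  puts JSON.generate(options)
end
```

Running the same query with each body and comparing the returned patterns would show whether the D- and L-patterns still both come back for NC(C(O)=O)C(O)=O under the Local or Global settings.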


Regarding your question


Moreover, can one really classify a compound as both an L- and a D-alpha amino acid in this very case (based on the same amino acid group)?


We do not classify compounds; we only provide tools and search options, and execute the searches. There are options that return an achiral compound as a hit for a chiral query (which is a substructure of the achiral compound).


Best regards,


Krisztina