filter out functional groups

User 6b1e802ce9

18-11-2010 03:46:56

We want to filter one of our chemical databases and remove some undesired functional groups from it.  How can we accomplish this using the ChemAxon tools?

ChemAxon d76e6e95eb

22-11-2010 11:19:32

Filtering can be done with substructure searching in JChem databases. Instant JChem is probably the easiest desktop application for that, although if you prefer the command line, you can use jcsearch.


If you want to convert all functional groups of a given type to something else, you might use Standardizer. It has a wizard-like graphical application and a command line tool as well.


If you prefer to convert functional groups in a synthetically feasible way, I would suggest Reactor.


For more details about the applications and their use, please see the documentation of the specific products:
http://www.chemaxon.com/products/

User 3575925344

17-12-2010 21:57:31

Hello,


 


For this purpose I wrote the following batch file, which reads the SMARTS patterns from a config file (SMARTS_FILTER.cfg in my case). The subsequent extraction of hits, using the list of ID numbers that passed the filter, is better done with SDF_toolkit (a Perl-based package); JChem classes are too slow for large datasets.


The batch file is based on the original ChemAxon evaluate.bat file, with the addition of processing multiple SMARTS strings. Unfortunately, ChemAxon classes do not allow reading a list of query strings from a file.


The batch file creates SDF files with the structures that contain undesirable functional groups and/or lists of their IDs. I used idnumber as the name of the ID field; if yours is different, you need either to rename it to idnumber in your SDF or to modify the code accordingly.
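The ID-based extraction step described above can be sketched in plain Python, as an illustration only; it is not the attached batch script and not SDF_toolkit. It assumes a standard SD file in which each record ends with a `$$$$` line and the ID sits in a data field named idnumber (as in the post). The function names `iter_sdf_records`, `get_field` and `split_by_ids` are hypothetical helpers invented for this sketch.

```python
from typing import Iterable, Iterator, List, Optional, Set, Tuple


def iter_sdf_records(lines: Iterable[str]) -> Iterator[List[str]]:
    """Yield each SDF record as a list of lines, ending with its '$$$$' line."""
    record: List[str] = []
    for line in lines:
        line = line.rstrip("\r\n")
        record.append(line)
        if line.strip() == "$$$$":
            yield record
            record = []
    # A final record without a '$$$$' terminator is still returned.
    if any(l.strip() for l in record):
        yield record


def get_field(record: List[str], name: str) -> Optional[str]:
    """Return the first value line of the data field '> <name>', or None."""
    header_seen = False
    for line in record:
        if header_seen:
            return line.strip()
        if line.startswith(">") and "<" + name + ">" in line:
            header_seen = True
    return None


def split_by_ids(
    lines: Iterable[str],
    flagged_ids: Set[str],
    id_field: str = "idnumber",  # assumption: field name taken from the post
) -> Tuple[List[str], List[str]]:
    """Split records into (kept, removed) according to a set of flagged IDs."""
    kept: List[str] = []
    removed: List[str] = []
    for record in iter_sdf_records(lines):
        target = removed if get_field(record, id_field) in flagged_ids else kept
        target.append("\n".join(record) + "\n")
    return kept, removed
```

To use it, pass an open SD file (or any iterable of lines) together with the set of IDs flagged by the substructure filter; the two returned lists hold the record texts of the clean and the flagged structures, ready to be written to separate files. Records are processed one at a time, but both result lists are held in memory, so for very large files it may be preferable to write each record out as it is classified.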


Please see the attached batch script and config files for Windows. Note that the script requires command extensions to be enabled (cmd /E:on) and uses the findstr command.


 


best regards,


Lex

ChemAxon d76e6e95eb

21-12-2010 14:48:24

I suppose that some of your statements are based on misunderstandings:


"JChem classes are too slow for large datasets"

That sounds strange; please see the benchmarks here:
http://www.chemaxon.com/jchem/doc/admin/Performance.html 


"Unfortunately ChemAxon classes do not allow to read a list of query strings from a file."

Reading a list of query strings is certainly supported by the ChemAxon classes, though probably not by some of the command line tools.


Thank you for your modified batch file.

User 3575925344

24-12-2010 01:43:00

Oh, I'm sorry; my point was indeed all about command line tools for working with standalone SD files, a somewhat simpler route for occasional tasks than writing scripts to handle a JChem database.


However, I did not find any option to set up a batch query (e.g. a list of undesirable functional groups) within the Instant JChem database manager. Did I miss something, or is Instant JChem not meant to cover all of JChem's functionality?


 


Thanks for the correction,


 


Lex

ChemAxon fa971619eb

25-12-2010 16:18:45

If you are using Instant JChem, your best bet is probably the Overlap analysis function.
http://www.chemaxon.com/instantjchem/ijc_latest/docs/user/help/htmlfiles/chemistry_functions/performing_overlap_analysis.html


Create a table of your undesirable functional groups and one for your structures, and run an overlap analysis on these two tables.  Also, you might find it handy to use a table type of 'Query structures' for the undesirable functional groups (see the section on 'Tab1: General settings').
http://www.chemaxon.com/instantjchem/ijc_latest/docs/user/help/htmlfiles/editing_database/editing_entities.html


Tim

User 3575925344

25-12-2010 21:39:20

Thanks Tim,


 


It is really marvelous that ChemAxon provides so many options to tackle huge structure datasets in the most appropriate way for each specific task!


 


I very much appreciate your suggestion,


 


Lex