I am trying to enumerate more than 200 scaffolds (molecule cores with three functionalization sites) with variable sets of functional groups. Currently I have scaffold files with the R groups defined within them; however, this is unwieldy and error-prone, as the user must copy and paste or draw many functional groups for each scaffold. It would be convenient if there were a method to make separate scaffold files and functional group files, then enumerate the scaffold files with the functional group file.
Is there a batch method to define a scaffold file and enumerate it with a functional group file? Is this the best approach, or is there another recommended way to accomplish my goal?
I have attached a truncated example of a current scaffold file for clarification.
I'm not sure the embedded Markush Enumeration plugin in MarvinSketch is the best solution for my problem. As I understand it, using the embedded plugin would require me to copy and paste my R1, R2, and R3 groups into 200+ scaffold files. What I was hoping for is a batch solution that would allow me to save lists of functional groups with attachment points as R1, R2, and R3, then simply use those saved files to enumerate the scaffold files.
I have found that I can fuse functional groups into files:
molconvert sdf Scaffold.sdf -R1 R1.sdf -o Enumeration_File.sdf
I have attached an example Scaffold.sdf, R1.sdf, and the resulting Enumeration_File.sdf. The problem I have here is that I cannot have more than one R group attachment in the R1.sdf file. Could you help me figure out how to use the molconvert fuse command to make enumeration files that have more than one molecule in the R1 position? Or could you recommend another method to enumerate 200+ scaffolds with a specific set of functional groups in each position?
Well done, you have already found the key tool to create the Markush structure from the scaffold and R-group definition files!
The problem I have here is that I cannot have
more than one R group attachment in the R1.sdf file. Could you help me
figure out how to use the molconvert fuse command to make enumeration
files that have more than one molecule in the R1 position?
You can have multiple R-group definitions in R1.sdf, as separate records.
Or could you
recommend another method to enumerate 200+ scaffolds with a specific set
of functional groups in each position?
In fact, molconvert can also process a multi-record SDF file as the scaffold input.
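Since both the scaffold input and the R-group inputs can be multi-record SDF files, existing single-structure files can be concatenated instead of redrawn. The following is only a minimal sketch: the function name merge_sdf and the file names in the usage note are hypothetical, and it relies only on the SDF convention that each record ends with a line containing "$$$$".

```shell
#!/bin/sh
# merge_sdf OUT IN1 [IN2 ...]
# Hypothetical helper: concatenates SDF files into one multi-record SDF.
# Assumes each input holds complete record(s); the "$$$$" record
# terminator is appended only when an input file does not end with it.
merge_sdf() {
  out=$1
  shift
  : > "$out"                      # create or truncate the output file
  for f in "$@"; do
    awk '
      { sub(/\r$/, "")            # normalise Windows line endings
        print
        last = $0 }
      END { if (NR > 0 && last != "$$$$") print "$$$$" }
    ' "$f" >> "$out"
  done
}
```

For example, merge_sdf multiple_R1.sdf methyl.sdf ethyl.sdf phenyl.sdf would collect three hypothetical fragment files into one R1 definition file, and the same call pattern could be used to build multiple_scaffolds.sdf from the individual scaffold files.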
In summary, you can use a command line like:
molconvert sdf multiple_scaffolds.sdf -R1 multiple_R1.sdf -R2 multiple_R2.sdf -o multiple_Markush.sdf
This way the R-group definitions will be merged into each scaffold from the input.
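As a quick sanity check after the merge, the number of records in the output can be compared with the number of scaffolds. This is a rough sketch that assumes one Markush record is written per input scaffold (which the description above suggests but which should be confirmed on a small test set); the function names count_records and same_record_count are hypothetical.

```shell
#!/bin/sh
# count_records FILE
# Prints the number of SDF records, counted as "$$$$" terminator lines.
count_records() {
  tr -d '\r' < "$1" | grep -c '^\$\$\$\$$'
}

# same_record_count FILE_A FILE_B
# Reports both counts and succeeds only if they are equal.
# Assumption: the merged output holds one record per scaffold record.
same_record_count() {
  a=$(count_records "$1")
  b=$(count_records "$2")
  echo "$1: $a record(s), $2: $b record(s)"
  [ "$a" -eq "$b" ]
}
```

Running same_record_count multiple_scaffolds.sdf multiple_Markush.sdf after the molconvert step would then flag a run in which some scaffolds were dropped or split.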
Furthermore, you can run the Markush enumeration step from the command line as well, via cxcalc.
If you would like to view the resulting Markush file (multiple_Markush.sdf), check out Markush Viewer.