In JChem, is there a way to create Markush structures from structures stored in structure tables? For example, I have the R-groups in one table and the scaffolds in another. I would take a scaffold, add R-groups from the R-group table using a selection mechanism, and store the resulting Markush structure in a Markush table. Do I wish too much?
Although there is no way to do this directly in the database, you can export your structures to files and then compose the scaffold and the R-group definition members into a Markush structure using molconvert. Use the -R[rid] (e.g. -R3) parameter. Example:
molconvert mrv scaffold.mrv -R3 r3def.mrv -o rg.mrv
You can also filter the fragments by their number of attachment points (one or two):
molconvert mrv scaffold.mrv -R3:1 fragments.mrv -o rg.mrv
Thanks a lot.
I still have one question. How do you create the fragments.mrv file with the attachment points?
In cxsmiles-doc I found:
Attachment point information
Atomic indexes of the attachment points written after "AP_x:" where x denotes the attachment point type (1 or 2), separated by commas.
Example: "AP_1:10,AP_2:3 "
I tried to import 'CCO |AP_1:3|' into MarvinSketch and got the following message:
Cannot read molecule 1
Does this mean I cannot use cxsmiles to manipulate attachment points?
The atom numbering for the extended smiles string begins with 0, so 3 was out of range in your example.
Try this: CCO |AP_1:2|
Please note that we are currently rewriting the attachment point representation for version 5.4. It may affect cxsmiles IO as well. (At least it means that more than 2 attachment points will be available per R-group.)
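To make the 0-based numbering concrete, here is a minimal Python sketch that checks attachment point indexes against the atom count before importing. It is not part of Marvin or JChem: the function name is hypothetical, the atom count has to be supplied by the caller, and the parsing of the "AP_x:" field (a bare number continuing the previous type) is an assumption based only on the cxsmiles-doc excerpt quoted above.

```python
def out_of_range_attachment_points(atom_count, ap_field):
    """Return the attachment point indexes that do not address an atom.

    atom_count: number of atoms in the SMILES part (3 for "CCO").
    ap_field:   text between the '|' bars, e.g. "AP_1:10,AP_2:3".
    """
    bad = []
    for token in ap_field.strip().split(","):
        token = token.strip()
        if not token:
            continue
        # "AP_1:2" carries the type before ':' and an atom index after it.
        # Assumption: a bare number is a further index of the same type.
        index_text = token.split(":", 1)[1] if token.startswith("AP_") else token
        index = int(index_text)
        # Atom indexes start at 0, so the valid values are 0 .. atom_count - 1.
        if not 0 <= index < atom_count:
            bad.append(index)
    return bad


print(out_of_range_attachment_points(3, "AP_1:3"))  # → [3]
print(out_of_range_attachment_points(3, "AP_1:2"))  # → []
```

For CCO the atoms are numbered 0, 1 and 2, which is why AP_1:3 is rejected and AP_1:2 points at the oxygen.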
Thanks, it works now. Péter
If you run molconvert to create Markush structures from scaffolds and fragments, it fails when the fragment file name contains a drive letter with a colon. Molconvert takes the colon as the beginning of the range definition and tries to interpret the rest of the file name as a range:
molconvert mrv C:\CCEI2377.mrv -R1 C:\CCEI2377_1.mrv -R2 C:\CCEI2377_2.mrv -o C:\CCEI2377_markush.mrv
C:\vp\Az\VirtualLibrary\jchem\100909\CCEI2377.mrv: error: For input string: "CCEI2377_2.mrv"
java.lang.IllegalArgumentException: Illegal range format: CCEI2377_2.mrv
at chemaxon.util.IntRange.parseRange(Unknown Source)
at chemaxon.util.IntRange.<init>(Unknown Source)
at chemaxon.formats.MolConverter.fuseFragments(Unknown Source)
at chemaxon.formats.MolConverter.convert0(Unknown Source)
at chemaxon.formats.MolConverter.convert(Unknown Source)
at chemaxon.formats.MolConverter.main(Unknown Source)
Caused by: java.lang.NumberFormatException: For input string: "CCEI2377_2.mrv"
at java.lang.NumberFormatException.forInputString(Unknown Source)
at java.lang.Integer.parseInt(Unknown Source)
at java.lang.Integer.valueOf(Unknown Source)
at java.lang.Integer.decode(Unknown Source)
... 6 more
This is a bug in how MolConverter handles the molecule range string: ':' is treated as a separator between the fragment file name and the index range of the molecules to be added as R-group members. The problem arises whenever the file name itself contains a ':' character, even on Linux.
molconvert mrv scaffold.mrv -R1:1 r1:xx.mrv > o.mrv
molconvert mrv scaffold.mrv -R1:1 r1:xx.mrv:1-2 > o.mrv
In the latter case we want to add only the first and the second molecule from the r1:xx.mrv fragment file.
I have fixed this for the upcoming 5.4 release: the string after the last ':' character is now treated as a range only if it is a valid range string.
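The rule described above can be sketched in Python as follows. This is only an illustration of the "last colon" check, not the actual MolConverter code (which is Java): the function name is hypothetical, and the range grammar (a single index such as "3" or an interval such as "1-2") is an assumption.

```python
import re

# Assumed range grammar: a single index ("3") or a closed interval ("1-2").
_RANGE = re.compile(r"(\d+)(?:-(\d+))?")


def split_file_and_range(arg):
    """Split 'file[:range]' at the last ':' only if the tail is a valid range.

    Returns (file_name, (first, last)), or (arg, None) when no range is given.
    """
    head, sep, tail = arg.rpartition(":")
    match = _RANGE.fullmatch(tail) if sep else None
    if match is None:
        # No ':' at all, or the text after the last ':' is not a range,
        # so the whole argument is the file name.
        return arg, None
    first = int(match.group(1))
    last = int(match.group(2)) if match.group(2) else first
    return head, (first, last)


print(split_file_and_range(r"C:\CCEI2377_2.mrv"))  # whole string is the file name
print(split_file_and_range("r1:xx.mrv:1-2"))       # file "r1:xx.mrv", molecules 1 to 2
```

With this check, a Windows drive letter or any other ':' inside the file name is no longer mistaken for the range separator, because the text after it does not parse as a range.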
Thanks for the report and sorry for the late answer.