Creating Markush structures using structure tables

User 173254b396

17-06-2010 09:59:58

Hi,


In JChem, is there a way to create Markush structures from structures stored in structure tables? For example, I have the R groups in one table and the scaffolds in another. I would take a scaffold and, using a selection mechanism, add R groups from the R group table; the resulting Markush structure would then be stored in a Markush table. Am I asking too much?


Cheers, Péter

ChemAxon fb166edcbd

17-06-2010 16:36:06

Although there is no way to do this directly in the database, you can export your structures to files and then compose the scaffold and the R-group definition members into a Markush structure using molconvert. Use the -R[rid] parameter (e.g. -R3), followed by the file containing the members of that R-group. Example:


molconvert mrv scaffold.mrv -R3 r3def.mrv -o rg.mrv

It is also possible to filter the fragments to those with one or two attachment point(s):


molconvert mrv scaffold.mrv -R3:1 fragments.mrv -o rg.mrv
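
If the command has to be assembled for many scaffolds, the argument list can be generated by a small script. The following is only a sketch: the function name and the dictionary layout are made up for illustration, and it uses only the options shown above (-R[rid], the optional :1 or :2 attachment point filter, and -o).

```python
def build_molconvert_args(scaffold, rgroups, output):
    """Build a molconvert argument list for composing a Markush structure.

    rgroups maps an R-group ID to (fragment_file, attachment_filter), where
    attachment_filter is None (no filtering), 1 or 2.
    This helper is hypothetical; it only assembles the options quoted in the thread.
    """
    args = ["molconvert", "mrv", scaffold]
    for rid in sorted(rgroups):
        fragment_file, attachment_filter = rgroups[rid]
        flag = f"-R{rid}"
        if attachment_filter is not None:
            if attachment_filter not in (1, 2):
                raise ValueError("attachment filter must be 1 or 2")
            flag += f":{attachment_filter}"
        args += [flag, fragment_file]
    args += ["-o", output]
    return args


if __name__ == "__main__":
    # Print the command line instead of running it, so no ChemAxon install is needed.
    print(" ".join(build_molconvert_args(
        "scaffold.mrv", {3: ("fragments.mrv", 1)}, "rg.mrv")))
```

The resulting list could be passed to subprocess.run(args, check=True) on a machine where molconvert is on the PATH.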

User 173254b396

16-09-2010 15:33:33

Thanks a lot.


I still have one question. How do you create the fragments.mrv file with the attachment points?  


Cheers, Péter

ChemAxon a3d59b832c

16-09-2010 22:18:22

Hi Péter,


One way is to prepare reagent files with Reactor. This way, the leaving group of a synthetic reaction can be replaced by an attachment point. For example, the attached reaction scheme replaces a Cl atom.


Another way is to use the API directly. MolAtom.setAttach()


( http://www.chemaxon.com/jchem/doc/dev/java/api/chemaxon/struc/MolAtom.html#setAttach%28int%29 )


can be used for that.


Best regards,


Szabolcs

User 173254b396

17-09-2010 15:22:35

Thanks again.


In cxsmiles-doc I found:


Attachment point information


Atomic indexes of the attachment points written after "AP_x:" where x denotes the attachment point type (1 or 2), separated by commas.
Example: "AP_1:10,AP_2:3 "


When I try to import 'CCO |AP_1:3|' with MarvinSketch, I get the following message:


Cannot read molecule 1


Does this mean I cannot use cxsmiles to manipulate attachment points?



Cheers, Péter


ChemAxon a3d59b832c

20-09-2010 11:04:23

Hi Péter,


Atom numbering in the extended smiles string begins with 0, so 3 was out of range in your example: CCO has three atoms, with indexes 0 to 2.


Try this: CCO |AP_1:2|
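
A quick way to catch this kind of off-by-one error before importing is to check the attachment point indexes against the atom count. The sketch below is an illustration only: the function names are made up, the field layout is assumed to follow the "AP_1:10,AP_2:3" example from the cxsmiles documentation quoted above, and the atom count is supplied by the caller rather than derived from the SMILES string.

```python
import re

# One entry of the attachment point field, e.g. "AP_1:2" (assumed layout).
_AP_ITEM = re.compile(r"AP_(\d+):(\d+)")


def parse_attachment_points(field):
    """Parse a field such as 'AP_1:10,AP_2:3' into (type, atom_index) pairs."""
    points = []
    for item in field.split(","):
        match = _AP_ITEM.fullmatch(item.strip())
        if match is None:
            raise ValueError(f"not an attachment point entry: {item!r}")
        points.append((int(match.group(1)), int(match.group(2))))
    return points


def out_of_range(points, atom_count):
    """Return the entries whose zero-based atom index is not below atom_count."""
    return [(ap_type, idx) for ap_type, idx in points if not 0 <= idx < atom_count]


if __name__ == "__main__":
    # CCO has three atoms, so valid zero-based indexes are 0, 1 and 2.
    print(out_of_range(parse_attachment_points("AP_1:3"), 3))
    print(out_of_range(parse_attachment_points("AP_1:2"), 3))
```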


Please note that we are currently rewriting the attachment point representation for version 5.4. This may affect cxsmiles I/O as well. (At the least, it means that more than 2 attachment points will be available per R-group.)


Best regards,


Szabolcs

User 173254b396

20-09-2010 12:58:00

Thanks, it works now. Péter

User 173254b396

01-11-2010 10:39:16

Hi,


If you run molconvert to create Markush structures from scaffolds and fragments, it fails when the fragment file name contains a drive letter with a colon. Molconvert takes the colon as the beginning of the range definition and tries to interpret the rest of the file name as a range:


molconvert mrv C:\CCEI2377.mrv -R1 C:\CCEI2377_1.mrv -R2 C:\CCEI2377_2.mrv -o C:\CCEI2377_markush.mrv
C:\vp\Az\VirtualLibrary\jchem\100909\CCEI2377.mrv: error: For input string: "CCEI2377_2.mrv"
java.lang.IllegalArgumentException: Illegal range format: CCEI2377_2.mrv
        at chemaxon.util.IntRange.parseRange(Unknown Source)
        at chemaxon.util.IntRange.<init>(Unknown Source)
        at chemaxon.formats.MolConverter.fuseFragments(Unknown Source)
        at chemaxon.formats.MolConverter.convert0(Unknown Source)
        at chemaxon.formats.MolConverter.convert(Unknown Source)
        at chemaxon.formats.MolConverter.main(Unknown Source)
Caused by: java.lang.NumberFormatException: For input string: "CCEI2377_2.mrv"
        at java.lang.NumberFormatException.forInputString(Unknown Source)
        at java.lang.Integer.parseInt(Unknown Source)
        at java.lang.Integer.valueOf(Unknown Source)
        at java.lang.Integer.decode(Unknown Source)
        ... 6 more


Regards, Péter

ChemAxon fb166edcbd

04-11-2010 12:50:31

This is a bug in how MolConverter handles the molecule range string: ':' is treated as a separator between the fragment file name and the index range of the molecules to be added as R-group members. The problem arises whenever the file name itself contains a ':' character, even on Linux.


Examples:


molconvert mrv scaffold.mrv -R1:1 r1:xx.mrv > o.mrv
molconvert mrv scaffold.mrv -R1:1 r1:xx.mrv:1-2 > o.mrv

In the latter case we want to add only the first and the second molecule from the r1:xx.mrv fragment file.


I have fixed this for the upcoming 5.4 release. MolConverter will check whether the string after the last ':' character is a valid range string, and only in that case will it be treated as a range string.
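
The rule can be illustrated with a short sketch. This is not the MolConverter code, only an illustration of the "last colon" check described above; the function name is made up, and the accepted range syntax (a single number or first-last, as in 1-2) is an assumption.

```python
import re

# Assumed range syntax: "N" or "N-M" (the thread only shows "1-2").
_RANGE = re.compile(r"(\d+)(?:-(\d+))?")


def split_file_and_range(arg):
    """Split 'file[:range]' at the last ':' only if what follows is a valid range.

    Returns (file_name, (first, last)) or (file_name, None) when no range is given.
    """
    head, sep, tail = arg.rpartition(":")
    if sep and head:
        match = _RANGE.fullmatch(tail)
        if match:
            first = int(match.group(1))
            last = int(match.group(2)) if match.group(2) else first
            return head, (first, last)
    # No colon, or the text after the last colon is not a range:
    # the whole argument is the file name.
    return arg, None


if __name__ == "__main__":
    for arg in (r"C:\CCEI2377_2.mrv", "r1:xx.mrv", "r1:xx.mrv:1-2"):
        print(arg, "->", split_file_and_range(arg))
```

With this check, a Windows path such as C:\CCEI2377_2.mrv is kept whole, because the text after its only colon is not a range.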


Thanks for the report and sorry for the late answer.