Mol import difference between mol:Usg and mol:Xsg

User 02c7249dc6

25-05-2010 19:47:48

The docs on file formats (http://www.chemaxon.com/marvin/help/formats/mol-csmol-doc.html#ioptions) list several options in the section on Mol import, including:


Xsg - Expand all S-groups.


Usg - Ungroup all S-groups


 


Can you comment a bit on the difference between these options?


When reading in a mol file containing something like pyrrole saved in aromatized form, using:


Molecule molecule = MolImporter.importMol(input, "mol:Usg");


leaves the molecule without a hydrogen on the nitrogen and with broken aromaticity.


Using:


Molecule molecule = MolImporter.importMol(input, "mol:Xsg");


reads in the molecule as expected, and round-tripping on aromaticity works fine.


 


If we add an Fmoc group to this pyrrole, the Fmoc is expanded correctly with both Usg and Xsg, but with Usg the hydrogen is again left off the nitrogen and the aromaticity is broken.


I've attached 2 mol files: pyrrole, and pyrrole with Fmoc.


Thanks in advance for clarification on this.


Dan

ChemAxon a3d59b832c

25-05-2010 20:37:33

Hi Dan,


I moved this topic over to the Marvin forum, which is also the place for file format related questions.


My colleagues will check this and answer soon.


 


Best regards,


Szabolcs

ChemAxon 5433b8e56b

02-06-2010 17:23:47

Hi Dan,


First of all, I have to apologize for the very late answer.


The difference between the two options:


- When you use Usg as the import parameter, the importer ungroups all S-groups in the imported structure. This means that the group information is not included in the imported structure.


- When you use Xsg as the import parameter, the importer expands all S-groups but keeps the S-group information, so you can still see and handle the groups in the imported structure.


In the case of the pyrrole ring, the problem is that Marvin stores the implicit hydrogen information in the mol file format as a data S-group when the aromatic form of the molecule is used. (See: Molfiles and compressed molfiles in Marvin) With Xsg, Marvin doesn't touch the group at all, because the groups are expanded only after the import has finished. With Usg, however, Marvin simply skips the group information in the mol file, and this is the cause of the misbehaviour: it should read this kind of data S-group rather than skip it.


We will fix this issue in one of the upcoming releases. Until then, if you want to import this kind of structure without group information programmatically, you should call the ungroupSgroups() method, or one of the ungroupSgroup(params) methods, on the Molecule object.
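

A rough sketch of this workaround, not a definitive implementation: import with Xsg (which reads the structure correctly in the cases described in this thread) and drop the group information afterwards. It assumes the Marvin library is on the classpath; the class name UngroupAfterImport and the method name importUngrouped are made up for illustration, and the result should be checked against your own files.

```java
import chemaxon.formats.MolFormatException;
import chemaxon.formats.MolImporter;
import chemaxon.struc.Molecule;

// Hypothetical helper class, named only for this example.
public class UngroupAfterImport {

    // Imports a molfile string and removes the S-group information
    // after the import, instead of using the "mol:Usg" import option.
    static Molecule importUngrouped(String molfile) throws MolFormatException {
        // "mol:Xsg" expands the S-groups but keeps the group information,
        // so the data S-group holding the implicit hydrogen is still read.
        Molecule molecule = MolImporter.importMol(molfile, "mol:Xsg");

        // Ungroup all S-groups on the already imported Molecule object.
        molecule.ungroupSgroups();
        return molecule;
    }
}
```

Calling importUngrouped(input) would then take the place of MolImporter.importMol(input, "mol:Usg") until the fix is released.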


We will notify you in this topic when the fix is ready.


Best regards,
Istvan