Standardizer error with R-groups

User 677b9c22ff

27-08-2007 23:06:59

Hi,


Standardizer 3.2.9 fails on the structure below, but only if the remove-fragment step is the last function in the Standardizer XML file. If it is the first function, Standardizer runs through all the molecules without any problem. The source is still the Biometa KEGG DB.





This code fails (remove fragment is last):






Code:
<?xml version="1.0" encoding="UTF-8"?>
<!-- Standardizer configuration file -->
<!-- This configuration file is created with ChemAxon Config Builder -->

<StandardizerConfiguration Version ="0.1">
   <Actions>
      <Dearomatize ID="dearomatize"/>
      <Transformation ID="Transform - remove countions and charges" Structure="O=C[O-:1].[Na,K;+:2]>>O=C[O:1]" Type="string"/>
      <Transformation ID="Transform NITRO" Structure="[O-:1][N+:2]>> [O:1]=[N:2],[NH1+:1][O-:2]>> [H:3][O:2][N:1]" Type="string"/>
      <Transformation ID="Transform carboxylate to carboxyl" Structure="[O-:3][C;X3:1]=[O:2]>>[H][O:3][C;X3:1]=[O:2]" Type="string"/>
      <Removal ID="removal" Method="keepLargest" Measure="keepLargest"/>
   </Actions>
</StandardizerConfiguration>






This code is OK (remove fragment is first). However, it does not remove all the water because of some of the other steps. The remove-fragment step should therefore come last, but then Standardizer fails.


Code:
<?xml version="1.0" encoding="UTF-8"?>
<!-- Standardizer configuration file -->
<!-- This configuration file is created with ChemAxon Config Builder -->

<StandardizerConfiguration Version ="0.1">
   <Actions>
      <Removal ID="removal" Method="keepLargest" Measure="keepLargest"/>
      <Dearomatize ID="dearomatize"/>
      <Transformation ID="Transform - remove countions and charges" Structure="O=C[O-:1].[Na,K;+:2]>>O=C[O:1]" Type="string"/>
      <Transformation ID="Transform NITRO" Structure="[O-:1][N+:2]>> [O:1]=[N:2],[NH1+:1][O-:2]>> [H:3][O:2][N:1]" Type="string"/>
      <Transformation ID="Transform carboxylate to carboxyl" Structure="[O-:3][C;X3:1]=[O:2]>>[H][O:3][C;X3:1]=[O:2]" Type="string"/>
   </Actions>
</StandardizerConfiguration>









SMILES:


O.O.O.O.O.O.O.[Na+].[Na+].[Na+].[Na+].[H]C12SCC(CSC3=NC(=O)C(=O)[N-]N3C)=C(N1C(=O)C2NC(=O)C(=NOC)C4=CSC(N)=N4)C([O-])=O.[H]C56SCC(CSC7=NC(=O)C(=O)[N-]N7C)=C(N5C(=O)C6NC(=O)C(=NOC)C8=CSC(N)=N8)C([O-])=O
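
For reference, the dot-separated components of this SMILES can be tallied with plain string handling. This is only a rough sketch: it assumes no ring-closure digits are shared across a `.` (which holds for this string), it uses no chemistry toolkit, and it says nothing about how Standardizer itself decides which fragment is the largest. `tally_components` is a made-up helper name.

```python
from collections import Counter

# The SMILES from the post, split over several literals for readability.
SMILES = (
    "O.O.O.O.O.O.O.[Na+].[Na+].[Na+].[Na+]."
    "[H]C12SCC(CSC3=NC(=O)C(=O)[N-]N3C)=C(N1C(=O)C2NC(=O)C(=NOC)C4=CSC(N)=N4)C([O-])=O."
    "[H]C56SCC(CSC7=NC(=O)C(=O)[N-]N7C)=C(N5C(=O)C6NC(=O)C(=NOC)C8=CSC(N)=N8)C([O-])=O"
)


def tally_components(smiles: str) -> Counter:
    """Count identical dot-separated SMILES components (string comparison only)."""
    return Counter(smiles.split("."))


counts = tally_components(SMILES)
for component, n in counts.most_common():
    print(n, component)
```

The two large components differ only in their ring-closure digits, so a string comparison lists them separately even though they look like the same anion. A keep-largest step therefore has two equally sized candidates to choose from in this record.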





-----------------------


Standardizer failed


83





chemaxon.marvin.modules.Parity.getParity(Parity.java:292)
chemaxon.marvin.modules.Parity.modfunc(Parity.java:137)
chemaxon.struc.MoleculeGraph.getParity(MoleculeGraph.java:1812)
chemaxon.reaction.ReactionPerformer.setParities(ReactionPerformer.java:1840)
chemaxon.reaction.ReactionPerformer.reactHit(ReactionPerformer.java:921)
chemaxon.reaction.ReactionPerformer.reactOne(ReactionPerformer.java:823)
chemaxon.reaction.ReactionPerformer.reactBase(ReactionPerformer.java:793)
chemaxon.reaction.ReactionPerformer.react(ReactionPerformer.java:745)
chemaxon.reaction.Standardizer.processReaction(Standardizer.java:1646)
chemaxon.reaction.Standardizer.performReaction(Standardizer.java:1600)
chemaxon.reaction.Standardizer.standardizeComponent(Standardizer.java:1760)
chemaxon.reaction.Standardizer.standardize(Standardizer.java:1826)
chemaxon.alchemist.standardizer.StandardizerAlchemistTask.calculate(StandardizerAlchemistTask.java:155)
chemaxon.alchemist.AlchemistTask$ActualTask.<init>(AlchemistTask.java:200)
chemaxon.alchemist.AlchemistTask$3.construct(AlchemistTask.java:97)
chemaxon.alchemist.utils.SwingWorker$2.run(SwingWorker.java:107)
java.lang.Thread.run(Unknown Source)





---------


Removing the R-groups first would help (but does not fix the error):





Code:
<?xml version="1.0" encoding="UTF-8"?>
<!-- Standardizer configuration file -->
<!-- This configuration file is created with ChemAxon Config Builder -->

<StandardizerConfiguration Version ="0.1">
   <Actions>
      <Removal ID="removal" Method="keepLargest" Measure="keepLargest"/>
      <Removal ID="removal" Method="rgroups" Measure="rgroups"/>
      <Dearomatize ID="dearomatize"/>
      <Transformation ID="Transform - remove countions and charges" Structure="O=C[O-:1].[Na,K;+:2]>>O=C[O:1]" Type="string"/>
      <Transformation ID="Transform NITRO" Structure="[O-:1][N+:2]>> [O:1]=[N:2],[NH1+:1][O-:2]>> [H:3][O:2][N:1]" Type="string"/>
      <Transformation ID="Transform carboxylate to carboxyl" Structure="[O-:3][C;X3:1]=[O:2]>>[H][O:3][C;X3:1]=[O:2]" Type="string"/>
      <Removal ID="removal" Method="removeSmallest" Measure="removeSmallest"/>
   </Actions>
</StandardizerConfiguration>

ChemAxon d76e6e95eb

29-08-2007 11:58:37

The 11th molecule in the KEGG database contains an undefined R-atom. The error message should have been better.

We will check what to do with such structures.

ChemAxon e08c317633

18-03-2008 15:44:30

Hi Tobias,





Biometa-KEGG DB is available for download only for academic users (see http://cheminf.cmbi.ru.nl/biometa/flat/). Can you attach a file that we can use to reproduce this error?





Regards,


Zsolt

User 677b9c22ff

18-03-2008 23:05:49

Hi Zsolt,


Yes, the license restriction is real and I cannot do anything about it. BTW, you can access KEGG as an end user; that's no problem.

Some of the structures are wrong by design.





LIOTRIX has a weird SDF design; I think it's wrong.


http://www.genome.jp/dbget-bin/www_bget?-f+m+drug+D00361





The problem is the secondary SD file constructs such as M CHG, M STY, M SLB, and M SAL.
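
One way to find such records before handing a file to Standardizer is to scan the SD file text for these property lines. The sketch below is only an illustration under stated assumptions: it assumes V2000 molfile records separated by `$$$$`, in which property lines start with `M` followed by two spaces (`M  STY`, `M  SLB`, `M  SAL`). `records_with_sgroups` is a made-up helper name, not part of any ChemAxon tool, and the sample text is a stripped-down stand-in rather than a real KEGG record.

```python
# Sgroup-related property line prefixes named in the post (assumed V2000 layout).
SGROUP_PREFIXES = ("M  STY", "M  SLB", "M  SAL")


def records_with_sgroups(sdf_text: str) -> list:
    """Return 0-based indices of SD records that contain Sgroup property lines."""
    flagged = []
    for index, record in enumerate(sdf_text.split("$$$$")):
        lines = record.splitlines()
        if any(line.startswith(SGROUP_PREFIXES) for line in lines):
            flagged.append(index)
    return flagged


# Two toy records: the second one carries an Sgroup type line.
sample = (
    "mol1\n\n\n  0  0  0  0  0  0  0  0  0  0999 V2000\nM  END\n$$$$\n"
    "mol2\n\n\n  0  0  0  0  0  0  0  0  0  0999 V2000\n"
    "M  STY  1   1 MUL\nM  END\n$$$$\n"
)
print(records_with_sgroups(sample))
```

Flagged records can then be inspected by hand or set aside before batch processing.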





Since Standardizer should be a "hardened product", I would include an option to ignore errors entirely (for example, a checkbox in the GUI) and go on with the next molecule.





I am thankful to Tim in that respect, because they implemented exactly that in Instant-JChem: it ignores wrong structures and even exports the illegal structures to a second file. So I curated my structures with Instant-JChem and sent the ones that were OK to the Standardizer and other programs.
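
The skip-and-continue behaviour described here can be sketched generically. This is not Standardizer or Instant-JChem code: `standardize` is a hypothetical callable standing in for whatever tool processes one structure, and `fake_standardize` is a stub that exists only for the demonstration.

```python
def process_all(structures, standardize):
    """Apply `standardize` to each structure; collect failures instead of stopping."""
    accepted = []
    rejected = []  # (original structure, error message) pairs for a second file
    for structure in structures:
        try:
            accepted.append(standardize(structure))
        except Exception as exc:  # deliberately broad: one bad record must not end the run
            rejected.append((structure, str(exc)))
    return accepted, rejected


def fake_standardize(smiles):
    """Stub: rejects anything containing an R-atom placeholder, passes the rest through."""
    if "[R]" in smiles:
        raise ValueError("undefined R-atom")
    return smiles


accepted, rejected = process_all(["CCO", "C[R]", "c1ccccc1"], fake_standardize)
print("accepted:", accepted)
print("rejected:", rejected)
```

Keeping the rejected structures together with their error messages makes it possible to write them to a separate file for later curation.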








Kind regards


Tobias








What I now get with Standardizer 3.2.9 and MVIEW 3.2.9 is the following error:





Bad multiple group subscript, not an integer





chemaxon.struc.sgroup.MultipleSgroup.setSubscript(MultipleSgroup.java:372)
chemaxon.marvin.modules.MolImport.setSgroupSubscript(MolImport.java:2088)
chemaxon.marvin.modules.MolImport.readPropertiesBlockV2(MolImport.java:1500)
chemaxon.marvin.modules.MolImport.readCtab(MolImport.java:992)
chemaxon.marvin.modules.MolImport.readMol0(MolImport.java:681)
chemaxon.marvin.modules.MolImport.readMol(MolImport.java:265)
chemaxon.formats.MolImporter.readMol(MolImporter.java:776)
chemaxon.formats.MolImporter.read(MolImporter.java:608)
chemaxon.formats.MolImporter.read(MolImporter.java:574)
chemaxon.alchemist.standardizer.StandardizerAlchemistTask.calculate(StandardizerAlchemistTask.java:152)
chemaxon.alchemist.AlchemistTask$ActualTask.<init>(AlchemistTask.java:200)
chemaxon.alchemist.AlchemistTask$3.construct(AlchemistTask.java:97)
chemaxon.alchemist.utils.SwingWorker$2.run(SwingWorker.java:107)
java.lang.Thread.run(Unknown Source)





For mview:

Exception in thread "Thread-130" java.lang.IllegalArgumentException: Bad multiple group subscript, not an integer
at chemaxon.struc.sgroup.MultipleSgroup.setSubscript(MultipleSgroup.java:372)
at chemaxon.marvin.modules.MolImport.setSgroupSubscript(MolImport.java:2088)
at chemaxon.marvin.modules.MolImport.readPropertiesBlockV2(MolImport.java:1500)
at chemaxon.marvin.modules.MolImport.readCtab(MolImport.java:992)
at chemaxon.marvin.modules.MolImport.readMol0(MolImport.java:681)
at chemaxon.marvin.modules.MolImport.readMol(MolImport.java:265)
at chemaxon.formats.MolImporter.readDoc(MolImporter.java:697)
at chemaxon.formats.MolImporter.nextDoc(MolImporter.java:622)
at chemaxon.marvin.view.MDocStorage.readDoc(MDocStorage.java:2289)
at chemaxon.marvin.view.MDocStorage.getMainDoc(MDocStorage.java:819)
at chemaxon.marvin.view.swing.modules.GridBagView.getDocument(GridBagView.java:601)
at chemaxon.marvin.view.swing.modules.GridBagView.setVisibleCanvas(GridBagView.java:1028)
at chemaxon.marvin.view.swing.ViewPanel.setVisibleCanvas(ViewPanel.java:1817)
at chemaxon.marvin.view.swing.modules.GridBagView.setVisibleCell(GridBagView.java:1467)
at chemaxon.marvin.view.swing.modules.GridBagView.visibleCells(GridBagView.java:1668)
at chemaxon.marvin.view.swing.modules.GridBagView.update(GridBagView.java:3194)
at chemaxon.marvin.view.swing.modules.GridBagView.access$800(GridBagView.java:45)
at chemaxon.marvin.view.swing.modules.GridBagView$8.run(GridBagView.java:3162)
at chemaxon.marvin.util.ThreadSerializer$1.run(ThreadSerializer.java:97)

ChemAxon d76e6e95eb

20-03-2008 17:07:44

The Standardizer wizard includes an option to ignore invalid molecules. Do you think this feature can solve your current problem?

User 677b9c22ff

21-03-2008 01:26:17

Hi Gyuri,


That's nice. Thank you.


Tobias