Name causing importMol to hang

User 0261d34ad7

23-05-2012 16:43:20

Hi, 




We've hit an issue with ChemAxon name to structure, where the following name causes the import process to hang:


(S)-2-(((((((Glutamyl)aspartyl)asparaginyl)glutamyl)phenylalanyl)leucyl)leucylamino)-6-(3-((4-((3-((2-(2-(2-(2-(2-(2-(2-(2-(2-(2-(4-(1-((3-(((S)-5-carbamoyl-5-(((((((glutamyl)aspartyl)-asparaginyl)glutamyl)phenylalanyl)phenylalanyl)leucylamino)pentyl)carbamoyl)benzyloxy)-imino)ethyl)benzoylamino)ethoxy)ethoxy)ethoxy)ethoxy)ethoxy)ethoxy)ethoxy)ethoxy)-ethoxy)ethyl)carbamoyl)phenoxy)methyl)1,2,3-triazol-1-yl)methyl)benzoylamino)hexanoic amide


The name is more than likely invalid, but it's very hard for us to avoid processing it because we have an automated name processing pipeline. We're importing using the code:


MolImporter.importMol(nameString, "name")


Can you let us know if this is a bug, and if so when it is likely to be fixed? We are using ChemAxon version 5.7.0.


Thanks!


Jim

ChemAxon e7b9408ca1

24-05-2012 12:10:03

Hi Jim,


Thanks for your report. I can confirm the problem, which is also present in the current version (5.9). The good news is that I have fixed it, so 5.10 will not hang in such cases. 5.10 is planned for mid-June. Let me know if an earlier release is important for you; we could probably do another 5.9.x release as well.


I trust you will also see improved results from our name to structure conversion when upgrading from 5.7 to 5.9 or 5.10. I would of course be happy to get some statistics about it, and/or some more sample data to drive our improvements.


Best regards,


Daniel


PS: 5.10 can actually generate a structure for this long name, in about 100 ms:


CC(C)C[C@H](NC(=O)[C@H](CC(C)C)NC(=O)[C@H](CC1=CC=CC=C1)NC(=O)[C@H](CCC(O)=O)NC(=O)[C@H](CC(N)=O)NC(=O)[C@H](CC(O)=O)NC(=O)[C@@H](N)CCC(O)=O)C(=O)N[C@@H](CCCCNC(=O)C1=CC(CN2C=C(COC3=CC(=CC=C3)C(=O)NCCOCCOCCOCCOCCOCCOCCOCCOCCOCCNC(=O)C3=CC=C(C=C3)C(C)=NOCC3=CC(=CC=C3)C(=O)NCCCC[C@H](NC(=O)[C@H](CC(C)C)NC(=O)[C@H](CC3=CC=CC=C3)NC(=O)[C@H](CC3=CC=CC=C3)NC(=O)[C@H](CCC(O)=O)NC(=O)[C@H](CC(N)=O)NC(=O)[C@H](CC(O)=O)NC(=O)[C@@H](N)CCC(O)=O)C(N)=O)N=N2)=CC=C1)C(N)=O

User 0261d34ad7

28-05-2012 09:46:30

Hi Daniel,


Great, thanks for the good news. We're currently running with 5.7, and our database includes a table of chemistry generated by ChemAxon - if we were to upgrade to 5.10, would that mean we'd have to upgrade the tables? I realize the release notes would probably give us this information but thought a direct question wouldn't hurt.


If we can upgrade to 5.10 without upgrading the database then we'll probably just go with that. Otherwise, a patch would be preferred - ideally to 5.7 though I appreciate that may be impossible.


Any thoughts appreciated,


Jim



ChemAxon e7b9408ca1

04-06-2012 09:37:10

Hi Jim,


After checking with the other teams: yes, it seems you will need a table upgrade. Do you expect problems with that?


As long as you are on 5.7, a simple possibility would be to filter out very long names before processing.
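
A minimal sketch of such a pre-filter, in plain Java so it can sit in front of the existing import call. The class name, method names and the 300-character cut-off are all hypothetical; no specific limit is given here, so the threshold would need tuning against your own data. The call to MolImporter.importMol appears only in a comment, so the sketch runs without the ChemAxon library.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper: skips very long names before they reach name to structure.
final class NameLengthFilter {

    // Assumed cut-off, not a value recommended by ChemAxon; tune it for your data.
    static final int MAX_NAME_LENGTH = 300;

    // True if the name is non-null and its trimmed length is within the limit.
    static boolean isShortEnough(String name, int maxLength) {
        return name != null && name.trim().length() <= maxLength;
    }

    // Returns the names that pass the length check, in their original order.
    static List<String> keepProcessable(List<String> names, int maxLength) {
        List<String> kept = new ArrayList<>();
        for (String name : names) {
            if (isShortEnough(name, maxLength)) {
                kept.add(name);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        // Build an artificial 500-character "name" to stand in for a problem input.
        StringBuilder longName = new StringBuilder();
        for (int i = 0; i < 500; i++) {
            longName.append('x');
        }
        List<String> names = Arrays.asList("ethanol", longName.toString());

        for (String name : keepProcessable(names, MAX_NAME_LENGTH)) {
            // In the real pipeline this is where you would call:
            // MolImporter.importMol(name, "name")
            System.out.println("would import: " + name);
        }
    }
}
```

Names that are filtered out could be logged and retried after the upgrade, since a length check alone cannot tell a valid long name from an invalid one.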


Daniel