why can't I import?

User 870ab5b546

11-02-2010 22:14:11

If I copy the XML in the attached .txt document and try to paste it into Marvin, nothing happens.  If I try to paste it into the Source window and import it, I get an error message.  I can't see how the XML in the .txt document differs from the XML in the attached .mrv document, which I can copy and paste.  Any ideas?  The .txt file came to me from a gmail message.

ChemAxon 7c2d26e5cf

15-02-2010 22:43:05

I have compared the two files byte by byte; the differences are visible in the dumps below.
Assignment....txt:


0000000 377 376   <  \0   ?  \0   x  \0   m  \0   l  \0      \0   v  \0
0000020   e  \0   r  \0   s  \0   i  \0   o  \0   n  \0   =  \0   "  \0
0000040   1  \0   .  \0   0  \0   "  \0      \0   ?  \0   >  \0  \r  \0
0000060   <  \0   c  \0   m  \0   l  \0   >  \0  \r  \0   <  \0   M  \0
0000100   D  \0   o  \0   c  \0   u  \0   m  \0   e  \0   n  \0   t  \0
0000120   >  \0  \r  \0 240  \0      \0   <  \0   M  \0   E  \0   F  \0
0000140   l  \0   o  \0   w  \0      \0   i  \0   d  \0   =  \0   "  \0
0000160   o  \0   1  \0   "  \0      \0   a  \0   r  \0   c  \0   A  \0
0000200   n  \0   g  \0   l  \0   e  \0   =  \0   "  \0   1  \0   5  \0
0000220   0  \0   .  \0   0  \0   "  \0      \0   h  \0   e  \0   a  \0

exportedMRV.mrv


0000000   <   ?   x   m   l       v   e   r   s   i   o   n   =   "   1
0000020   .   0   "       ?   >  \n   <   c   m   l   >  \n   <   M   D
0000040   o   c   u   m   e   n   t   >  \n           <   M   E   F   l
0000060   o   w       i   d   =   "   o   1   "       a   r   c   A   n
0000100   g   l   e   =   "   1   5   0   .   0   "       h   e   a   d
0000120   S   k   i   p   =   "   0   .   1   5   "       h   e   a   d
0000140   L   e   n   g   t   h   =   "   0   .   5   "  \n
0000160                               h   e   a   d   W   i   d   t   h
0000200   =   "   0   .   4   "       t   a   i   l   S   k   i   p   =
0000220   "   0   .   2   5   "   >  \n                   <   M   A   t

As far as I can see, in the first case each character is stored in 2 bytes (UTF-16 encoding). In the second case, each character (or most of them) is represented by a single byte (UTF-8 encoding).


We need to check further whether the given file is a valid UTF-16 encoded stream. In theory, Marvin should import UTF-16 encoded files if the input encoding has been specified.
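
For anyone who wants to check the encoding outside Marvin, here is a minimal Python sketch. It is not Marvin's import code, and the function name decode_with_bom is made up for illustration. It assumes the file starts with a byte-order mark, as the first dump does (octal 377 376 is FF FE, the UTF-16 little-endian mark), and falls back to UTF-8 otherwise.

```python
import codecs

def decode_with_bom(data: bytes, default: str = "utf-8") -> str:
    """Decode bytes, using a leading byte-order mark to pick the encoding."""
    # FF FE (octal 377 376 in the dump) marks UTF-16 little-endian; FE FF marks big-endian.
    if data.startswith(codecs.BOM_UTF16_LE) or data.startswith(codecs.BOM_UTF16_BE):
        return data.decode("utf-16")      # the "utf-16" codec consumes the mark
    if data.startswith(codecs.BOM_UTF8):
        return data.decode("utf-8-sig")   # strips the UTF-8 mark
    return data.decode(default)           # no mark: assume the default encoding

# Bytes shaped like the start of the first dump: mark, then 2 bytes per character.
sample = codecs.BOM_UTF16_LE + '<?xml version="1.0" ?>\r<cml>'.encode("utf-16-le")
text = decode_with_bom(sample)
print(text.splitlines()[0])
```

If the decode step raises UnicodeDecodeError, the stream is not valid in the detected encoding, which is the check mentioned above.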

User 870ab5b546

15-02-2010 22:58:40

I used TextEdit to save the UTF-16 file in UTF-8 format.  Then I copied the text and tried to paste it into Marvin.  I observed the same behavior.  I get an error in line 5.


But then I pasted it into MS Word 2004 and saved it from there in "Text only" format.  Then Marvin would recognize it.  It's a lot of trouble, though.

ChemAxon 7c2d26e5cf

01-03-2010 16:42:31

I do not know exactly how you created this UTF-16 file, but the line ending character is not correct.


Instead of \r\n (Windows) or \n (other systems), the line ending in your file is \r followed by 00A0 (octal 240 in the dump above), which is a non-ASCII character.


When you open this file (in UTF-16 format) in TextEdit, TextEdit may have ignored the 00A0 character after \r and started a new line instead.


MS Word probably removed the strange character from the text and replaced it with a \n character, and so produced normal text from the original string.


Marvin could not read the original text because the newline characters were missing from it, which caused an error while reading the stream.
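
As a possible workaround until this is handled on import, the pasted text can be cleaned up before it is given to Marvin. The Python sketch below is only an illustration under two assumptions taken from the first dump: lines end with a bare \r, and the character shown as octal 240 is the no-break space U+00A0. The function name normalize_pasted_xml is hypothetical, and this is not how Marvin itself handles the text.

```python
def normalize_pasted_xml(text: str) -> str:
    """Replace no-break spaces with plain spaces and unify line endings to LF."""
    # U+00A0 (octal 240 in the dump) becomes an ordinary space.
    text = text.replace("\u00a0", " ")
    # Convert \r\n first so that a Windows ending does not turn into two newlines.
    text = text.replace("\r\n", "\n").replace("\r", "\n")
    return text

# Text shaped like the start of the problem file, after decoding from UTF-16.
pasted = '<cml>\r<MDocument>\r\u00a0 <MEFlow id="o1">'
print(repr(normalize_pasted_xml(pasted)))
```

This does roughly what the round trip through MS Word appears to have done, without the manual steps.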

User 870ab5b546

01-03-2010 16:48:55

This file came from an email message sent from a gmail account.  I just copied the text in the email message.  The XML was copied from Edit -> Source and pasted into the gmail message.  


I can't control what email programs my students use.

ChemAxon 5433b8e56b

11-03-2010 14:06:33

Dear Bob,


We have finally found the root of the reported problem. The fix will be included in the upcoming 5.3.2 release.


The problem was a specific Unicode space character that messed up the imported structure, as Tamas found, but it was a little tricky to fix.


Sorry for the very late reaction, but at least I can give you this good news. Thank you for the report.


Best wishes,
Istvan

User 870ab5b546

11-03-2010 14:16:04

Thanks, that'll make my life a lot easier.