Problems opening xyz-files

User 25d107bd42

23-09-2008 16:06:50

Hi,


a colleague of mine has to convert xyz-files to mrv-files and produce png-pictures of the molecules. This morning he did the job successfully for a lot of molecules. But now, in the afternoon, it does not work anymore, even with the same molecules as in the morning.





The question is: Is the number of molecules for academic users limited per day?





Regards, Hans-Ulrich

ChemAxon b124dd5f17

23-09-2008 16:18:50

Hi,





Definitely not. The only limitation for academic research users is that they cannot mount Marvin or Plugin functionality on servers. Another limitation, though not related to this, is that searching is limited to 3 structures/minute with JChem Base (not Instant JChem).





What else changed apart from the time? Please also indicate the version -- there were changes in license handling from version 5 of JChem and Marvin.





Cheers


Alex

User 25d107bd42

23-09-2008 16:49:48

Hi Alex,





thank you for your quick answer. So we have to search for another reason for the problems.


It is version 5.1.1 and the old license handling, and Windows XP.





Cheers, Hans-Ulrich

ChemAxon b124dd5f17

24-09-2008 07:40:43

Hi Hans,





When you say "old license handling", do you mean the license key (alphanumeric string)? If so, this will stop being supported from October 1st, so if your colleague was playing with his computer date, that could explain it.





To fix this, you need to replace the license key with a license file and install it. You can contact sales, or revisit the academic package and resubmit; I can send a link to a private forum area where the correct file will be.





Let me know if this helps


Alex

User 25d107bd42

24-09-2008 08:03:26

Hi Alex,





I know about the change in licence handling, but I don't think my colleague has set his computer date forward. Why should he do this? It is difficult for me to analyse his problems: he is 500 km from here, he is using Windows, and I am, as you know, a UNIX/Linux user.


But he reported on the telephone a curious workaround: opening the xyz-file in Word, saving it as a txt-file, renaming it to xyz and then opening it in MarvinSketch, and then it works ?!? Maybe it depends on different line-end bytes or empty lines at the end of the file. I cannot analyse it now and here.





Cheers, Hans-Ulrich





PS: Now I will order a new license key using the academic page.

ChemAxon 909aee4527

24-09-2008 08:37:58

Hi Hans-Ulrich,





if we are to help you analyze the problem, we would ask for the following information:


- problematic xyz and mrv files if available (or did it go wrong with any file?)


- description of the way of converting to mrv and png





Please post the info in the Marvin forum either with linking this topic or summarizing the problem and the workaround you've found.





Kind regards,


Judit

User 25d107bd42

24-09-2008 09:08:42

Hi Judit,





today it's not possible for me to get the information we need. I hope my colleague can send me the problematic files so that I can evaluate the problem. When I get more information, I will post it in a new topic in the Marvin forum.





Thanks for the intensive discussion,





Kind regards, Hans-Ulrich





P.S. Sep 25, 2008 11:00 am: My colleague has deleted the problematic xyz-files. When he has new problems, he will send those files to me.

User 25d107bd42

08-10-2008 15:11:46

Hi Judit,


it seems I have found one reason for the problems my colleague experienced. I got from him the attached xyz-file PYRR8T-3.xyz. Opening this file with MarvinSketch produces the open window shown in screenshot bf0111, where you see the information that there are 2 molecules. It is followed by the sentence: "Please enter molecule number to load (1-2)".


Giving 1 opens the expected molecule in MarvinSketch. Giving 2 produces "Cannot read molecule 2", which is OK because there is no molecule 2 in this xyz-file.





I edited the xyz-file PYRR8T-3.xyz using an ASCII-text editor and found 3 empty lines at the end. Deleting 2 of these 3 lines gives PYRR8T-1.xyz and now MarvinSketch finds only 1 molecule and the information in the open window shows only 1 molecule and the window "Please enter molecule number to load (1-2)" is not shown.





So the problem only irritates the user, and one can easily work around it.





Regards, Hans-Ulrich





PS 1: I suggest moving this topic entirely to the forum "Structure editing, viewing and file formats".





PS 2 (added Oct 9, 2008):


An additional test showed that having 2 empty lines at the end also produces the sentence "Please enter molecule number to load (1-2)". So an xyz-file should have only one empty line, or Marvin must recognize empty lines correctly.
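
The manual workaround above (deleting the extra empty lines in a text editor) could be automated. The following is a minimal sketch in Python, assuming the only problem is surplus blank lines at the end of the file. The function name `strip_trailing_blank_lines` is made up for illustration and is not part of Marvin, and whether Marvin needs exactly this layout has not been verified.

```python
def strip_trailing_blank_lines(text: str) -> str:
    """Remove all empty or whitespace-only lines at the end of the text.

    Blank lines inside the file (for example an empty title line) are kept.
    The result ends with exactly one newline and uses "\n" line endings.
    """
    lines = text.splitlines()  # also accepts Windows "\r\n" line endings
    while lines and not lines[-1].strip():
        lines.pop()
    return "\n".join(lines) + "\n" if lines else ""


if __name__ == "__main__":
    sample = "2\n\nH 0.0 0.0 0.0\nH 0.0 0.0 0.74\n\n\n\n"
    print(repr(strip_trailing_blank_lines(sample)))
```

Because the line endings are normalised as a side effect, this may also cover the "open in Word and save as txt" workaround, but that is only a guess.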

ChemAxon 909aee4527

08-10-2008 16:00:15

Hello Hans-Ulrich,





the topic has been moved.





Thank you for the details of the problem.





Only certain file extensions are allowed, for example the most frequent image and structure file types. Please tell me which extension you would like to upload, and we will solve it.





Kind regards,


Judit

User 25d107bd42

08-10-2008 18:50:34

Hi Judit,





does this also concern the import of cif-files ?





Kind regards, Hans-Ulrich

ChemAxon 909aee4527

09-10-2008 07:37:17

Sorry, I meant that only certain file extensions are allowed for upload to the forum. I was replying to your original PS 2, which has now been removed, as I see.





If there is something you cannot attach to the forum post, write us the required extension and we will set it to be allowed.

User 25d107bd42

09-10-2008 07:49:41

Hi Judit,


evaluating the handling of xyz-files further, I produced the file butane.xyz (attached) containing the three conformers of n-butane (trans and 2 gauche). The screenshot shows the upper part of this xyz-file. This corresponds to the format which is most often used:





Line 1: n = number of atoms


Line 2: Title, comments and other text; in the example here: empty.


Line 3 to line n+2: Atom symbol and xyz-coordinates





The next molecule begins in the following line with its number of atoms. And so on.
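
The layout described above can be read with a few lines of code. This is a minimal sketch in Python, assuming element symbols (not element numbers) in the first column and whitespace-separated coordinates. `read_xyz_frames` is a hypothetical helper, not a Marvin or babel function. It tolerates stray empty lines before a record or at the end of the file.

```python
def read_xyz_frames(text):
    """Parse concatenated XYZ records.

    Returns a list of (title, atoms) tuples, where atoms is a list of
    (symbol, x, y, z) tuples.
    """
    lines = text.splitlines()
    frames = []
    i = 0
    while i < len(lines):
        if not lines[i].strip():
            i += 1  # skip stray empty lines between or after records
            continue
        n = int(lines[i])  # line 1: number of atoms
        # line 2: title line, which may be empty
        title = lines[i + 1] if i + 1 < len(lines) else ""
        atoms = []
        for row in lines[i + 2 : i + 2 + n]:  # lines 3 to n+2
            symbol, x, y, z = row.split()[:4]
            atoms.append((symbol, float(x), float(y), float(z)))
        if len(atoms) != n:
            raise ValueError(f"record at line {i + 1}: expected {n} atoms")
        frames.append((title, atoms))
        i += n + 2
    return frames


if __name__ == "__main__":
    sample = "2\n\nH 0.0 0.0 0.0\nH 0.0 0.0 0.74\n1\nhelium\nHe 1.0 2.0 3.0\n\n\n"
    for title, atoms in read_xyz_frames(sample):
        print(repr(title), len(atoms))
```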





In previous examples I had totally different molecules in such a file, no problem.





But I have one wish: Is it possible to write the name of the file and the energy of the conformer in the second line when you are "Storing conformation information in property field"? In the mrv- and the sdf-format the energy is saved for the individual conformers. Putting this energy in the title line would be a good feature, especially when you are using this format as input for other programs (as I do for quantum chemical programs). Both pieces of information are present in the MarvinSketch tool at the time you are saving the conformers.





Kind regards, Hans-Ulrich

User ef5e605ae6

09-10-2008 11:52:38

Hi Hans-Ulrich,
Quote:
Is it possible to write in the second line the name of the file
I suppose you mean the name of the molecule. (Molecule.getName())
Quote:
and the energy of the conformer when you are "Storing conformation information in property field" ?
Maybe, but in what format?


"molecule_name\tEnergy=..."


?





regards,


Peter

User ef5e605ae6

09-10-2008 12:35:45

OK, it will work in 5.1.3. The format is "molname\tenergy_value", as in babel.


Peter
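
To illustrate the title-line layout mentioned above, here is a minimal sketch in Python. It assumes the second line of each XYZ record holds the molecule name, a tab, and the energy as a plain number. That layout is taken only from the description in this post and has not been checked against Marvin 5.1.3 or babel output. `parse_title` is a hypothetical helper.

```python
def parse_title(title):
    """Split an XYZ title line of the assumed form "molname\\tenergy_value".

    Returns (name, energy). The energy is None when there is no tab or
    when the second field is not a number.
    """
    name, sep, energy_text = title.partition("\t")
    try:
        energy = float(energy_text) if sep else None
    except ValueError:
        energy = None
    return name, energy


if __name__ == "__main__":
    print(parse_title("Butane\t48.915"))
    print(parse_title("Butane m1"))
```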

User 25d107bd42

09-10-2008 15:58:31

Hi Peter, that's very good.





In the sdf- and the mrv-output there are also the numbers of the conformers.





May I suggest another setting, too?





In another topic of mine I argued for implementing the international standard units:


http://www.chemaxon.com/forum/ftopic3497.html


and you programmed the option to display the energies either in kcal/mol or kJ/mol, which is very good for users who are "thinking" in one or the other unit.





But in the files in sdf-format and mrv-format the energies are still only given in kcal/mol. Would it be possible to make this output follow the unit setting of the MarvinSketch GUI?





Regards, Hans-Ulrich

User ef5e605ae6

14-10-2008 05:50:22

Hi Hans-Ulrich,


We discussed your new suggestion. Theoretically, it would be better to store the energy as a number instead of a string property (as it is currently) and together with the unit information. Then the unit could be stored as a CML attribute in MRV and also in a weird, Marvin-specific way in SDF. Even if we store the energy as a number (double), without units, then the SDF output contains


> <Energy>
MProp:scalar:double:48.91518542316838


You could import it correctly with the chemaxon tools only.





Summarizing, it seems that in practice it would not be useful to improve the energy storage. I know you only want the unit to be optional in export, but that is not enough. The import must also know the units, otherwise your x kJ/mol would be imported as x kcal/mol (unless you also specified an additional import option, but you would certainly forget it). It seems there is no acceptable solution, and even if there is, it is much more work than it's worth.





regards,


Peter

User 25d107bd42

14-10-2008 12:16:51

Hi Peter,


OK. I agree with your last sentence "... it is much more work than it's worth." The conversion from kcal/mol to kJ/mol could be done more easily in the program reading the Marvin output.
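
Such a conversion on the reading side is a one-liner. Here is a minimal sketch in Python, assuming the thermochemical calorie (1 kcal = 4.184 kJ exactly). The function name is made up for illustration.

```python
KJ_PER_KCAL = 4.184  # thermochemical calorie, exact by definition


def kcal_to_kj(energy_kcal_per_mol):
    """Convert an energy from kcal/mol to kJ/mol."""
    return energy_kcal_per_mol * KJ_PER_KCAL


if __name__ == "__main__":
    # Energy value quoted earlier in this thread, assumed to be in kcal/mol.
    print(kcal_to_kj(48.91518542316838))
```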





In the attached screenshot there are three parts of the sdf-file for the three conformers of n-butane. In the title line beginning with "Marvin" there are obviously the date and time of the calculation and the saving and with 5 digits the energy. Looks fine. In the corresponding mrv-files you can also find a molecule number "m1, m2, m3" and so on if there are more conformers.


I think it would be easy to put this numbering also in the xyz-output of Marvin, as I have added in red.





Regards, Hans-Ulrich





BTW: What is the reason for double precision energy numbers and also for coordinates and other float values ? I think single precision must be enough, as you can see in the energy values for n-butane. And the last two conformers must have the same energy, they are enantiomers. And in the title line it is OK.

User ef5e605ae6

14-10-2008 13:29:46

Hi Hans-Ulrich,





I see a contradiction in your next request. You have an SDF screenshot but write about xyz. Assuming you need this feature in SDF, my answer is no: characters 22-34 are reserved for "scaling factors" in the molfile format. If you want this feature in XYZ, then I also resist making incompatible changes. Could you point me to an XYZ documentation describing such a feature? On the other hand, this feature seems superfluous to me, since m1, m2 etc. contain only the number of the molecule. You can also get this number by simple counting: the n-th exported molecule is "mn". It is stored in MRV only because the file may contain internal references.





I could ask the opposite: what would be the reason to use single precision if computers use double precision arithmetic since the 70's? We do not care that the 6th digit is physically unimportant because there are other considerations and the SDF's internal contents are not formatted for direct viewing anyway.


Internally, numbers are stored in double precision in the code because it is the simplest way. The problems with 5-digit precision saving would be that


1. information would be lost in truncation, export -> import -> export could produce a different file. You might not consider it to be a problem but it is sometimes.


2. energy storage would become more complicated for the clean3d guys, they would have to use some tricky String.format("%.5f") conversion which is not even available in java < 1.5. It is much simpler for them to use String.valueOf(double). Since energy is stored by them in many places of their code, hard coding that conversion would be a bad idea anyway...





If you want to display the result in 5-digit precision, then you should convert it to a 5-digit number after reading the SDF.
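
As an illustration of rounding on the reading side, here is a minimal sketch in Python. It assumes "5-digit precision" means five decimal places, as in the "%.5f" format mentioned above. `format_energy` is a hypothetical helper. The example also shows why the stored value is better kept at full precision: the rounded text no longer converts back to the original number, while the full representation does.

```python
def format_energy(value, decimals=5):
    """Format an energy for display with a fixed number of decimal places."""
    return f"{value:.{decimals}f}"


if __name__ == "__main__":
    energy = 48.91518542316838           # value as stored in the file
    shown = format_energy(energy)        # rounded only for display
    print(shown)
    print(float(shown) == energy)        # rounding loses information
    print(float(repr(energy)) == energy) # full precision round-trips
```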





regards,


Peter

User 25d107bd42

14-10-2008 14:35:42

Hi Peter,


to the main point in this topic: Of course I would like to have some information in the xyz- output of MarvinSketch (My mentioning of the sdf-line was only a little further idea, because in the mrv-files there are m1, m2 ...).





So it would be fine to have an xyz-output with a title line as in the screenshot attached here. I have programs written by myself which use information given in the title line of the xyz-file, e.g. method settings, parameter settings and others.





I don't know a reference where the xyz-format is defined. Do you know one? From several programs I get different xyz-files, either with element numbers or with element symbols. The order of the atom-number line and the title line is also sometimes different. So I have programmed my FORTRAN program to recognise the different xyz-files as input. And this is not so easy to do in a language which is intended for number crunching. The conversion program babel produces nearly the same format as MarvinSketch.





To the second point: I cannot totally agree with your argumentation, but I think this point is not important enough for a long discussion, and it may be more practical to always use double precision.





Regards, Hans-Ulrich

User ef5e605ae6

15-10-2008 06:46:23

Hi Hans-Ulrich,





Regarding the not so important point, I realized that energy storage in the molfile header had been left quite incomplete by someone. Now I have completed it: MolImport reads the energy from the header, and the additional SDF field (with the double precision number) is removed, as it was superfluous.





About XYZ, I also do not know about any documentation, hence I would not dare to change the title line even if I would consider your proposal to be useful. According to babel, the first column must be the molecule name, the second column must be the energy. There is no sense in replacing them with an incorrect MDL molfile header (with additional superfluous information like m1, m2), it would only be a source of incompatibility.





regards,


Peter

User 25d107bd42

15-10-2008 08:12:33

Hi Peter,


OK. No m1, m2... it's redundant.





I have tested the conversion from sdf / mdl to xyz using the super program babel, see the screenshot.


Obviously babel has problems with the title line of the sdf-file.





Regards, Hans-Ulrich

ChemAxon 8b644e6bf4

15-10-2008 13:52:22

Dear Hans-Ulrich,
Quote:
To the main point in this topic: Of course I would like to have some information in the xyz- output of MarvinSketch
In MarvinSketch, under File/Document settings you can edit the "title" field, which will appear in the second line of an exported xyz file.





regards,


Gabor

User 25d107bd42

15-10-2008 15:56:55

Dear Gabor,


thank you for your tip "File/Document settings". It works and produces a title line with all the information we want.





Exporting from Marvin to sdf also produces this title line in the sdf-file and converting it with


babel to xyz transfers the title line exactly, in spite of the WARNING-output of babel.


Something in the sdf/mdl-output of MarvinSketch is wrong. But it works and so we can handle it.





Kind regards, Hans-Ulrich

User ef5e605ae6

15-10-2008 18:26:32

Dear Hans-Ulrich,





I'm afraid you are using an old, buggy babel version from the past millennium.


No error, no warning, it works perfectly:





trillian~/chemaxon/marvin% babel butane1.sdf butane1x.xyz
1 molecule converted
9 audit log messages
trillian~/chemaxon/marvin% head butane1x.xyz
14
Butane m1
C 2.65980 1.20820 -2.09430
C 1.30580 0.83610 -1.44930
C 1.46910 0.06440 -0.11220
C 0.11580 -0.30770 0.53300
H 3.23680 1.84710 -1.42230
H 2.49130 1.74770 -3.02910
H 3.23650 0.30650 -2.31150
H 0.74150 0.22260 -2.15670





But it took me at least an hour to find that out. Could you please stick to *real* bug reports and *generally* useful feature requests? We have a limited amount of time for the forum; coding would be more fruitful.





best regards,


Peter

User 25d107bd42

15-10-2008 19:10:12

Dear Peter,





please excuse me for posting an incorrect bug report.


babel -version gives: Open Babel 2.1.1 -- Mar 7 2008 -- 14:06:20,


and I could not see that it is a bug in the babel program.





Best regards, Hans-Ulrich





PS: There were some "*real* bug reports and *generally* useful feature requests"


I have posted in the last months, e.g. in the field of Hückel calculations.


But maybe it's enough now. Sorry once more if I have taken your time.

User 25d107bd42

17-10-2008 16:45:21

Hi Peter,





the warning message of my babel program is erroneous. The warning comes when this program version finds a $$$$-line.


But this $$$$-line is, per sdf-definition, the end-line of a molecule and the separator between multiple molecules.


I looked in the documentation http://www.mdl.com/downloads/public/ctfile/ctfile.pdf
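
To illustrate the role of the $$$$-line as record terminator, here is a minimal sketch in Python that splits the text of an SD file into its individual records. `split_sdf` is a hypothetical helper, not a babel or Marvin function. It does no validation of the molfile blocks themselves.

```python
def split_sdf(text):
    """Split SD file text into records, using "$$$$" lines as terminators.

    The "$$$$" lines themselves are not included in the returned records.
    """
    records = []
    current = []
    for line in text.splitlines():
        if line.startswith("$$$$"):
            records.append("\n".join(current))
            current = []
        else:
            current.append(line)
    # Tolerate a last record that lacks its "$$$$" terminator.
    if any(line.strip() for line in current):
        records.append("\n".join(current))
    return records


if __name__ == "__main__":
    sample = "mol1\n  Marvin\n\nM  END\n$$$$\nmol2\n  Marvin\n\nM  END\n$$$$\n"
    print(len(split_sdf(sample)))
```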





But it is only a warning and the output of my babel version is OK.





I will send this bug to the Open babel developers.





Best regards, Hans-Ulrich

User ef5e605ae6

18-10-2008 15:39:24

Hi Hans-Ulrich,





Thanks for finding this out! However, I think they already fixed it in babel 2.2.0, that's why I could not reproduce the warning.





best regards,


Peter