Problem with conversion from mol2 to sdf

User f7c6611bf1

12-11-2012 15:27:21

I tried to convert some charged molecules from mol2 format to sdf format using MarvinView 5.11.3, but the structures produced are not correct.


See the attached example, in which a positively charged molecule is converted to a negatively charged one. I have also included the correct structure produced by Babel.

ChemAxon d51151248d

14-11-2012 09:38:14

Dear jlopes,


I looked at the files you attached and find the situation a bit confusing: the input and output files of the conversion both contain a molecule with a negatively charged N atom, while the Babel-converted output file contains a positively charged molecule. So it seems to me that Marvin converted the file correctly.


If you still have this problem, please check your .mol2 files and confirm which program produced which file.


We are looking forward to hearing from you if you still have the bug.


 


Daniel


 

User f7c6611bf1

15-11-2012 19:31:19

Dear Daniel,


Thank you for the response, but I think the problem is more complex. If you sum the partial charges over all atoms, you can see that the molecule as a whole is positively charged. The fact that one of the atoms has a negative calculated partial charge does not imply that anything is wrong. Many factors affect how the charge is distributed over the atoms.
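As a minimal sketch of that summation, the Python below reads the partial-charge column of the @<TRIPOS>ATOM section of a mol2 file and adds it up. The function name and the ammonium example are hypothetical, and the charge values are illustrative only; they are not taken from the attached files. It assumes every atom record carries the optional subst_id, subst_name and charge columns.

```python
def net_partial_charge(mol2_text):
    """Sum the partial charges listed in the @<TRIPOS>ATOM section."""
    total = 0.0
    in_atoms = False
    for line in mol2_text.splitlines():
        stripped = line.strip()
        if stripped.startswith("@<TRIPOS>"):
            # Any new record type ends the ATOM section.
            in_atoms = stripped == "@<TRIPOS>ATOM"
            continue
        if not in_atoms or not stripped:
            continue
        # Columns: atom_id name x y z type subst_id subst_name charge
        fields = stripped.split()
        if len(fields) >= 9:
            total += float(fields[8])
    return total


# Illustrative ammonium ion: the N atom has a negative partial charge,
# yet the molecule as a whole carries a net charge of +1.
EXAMPLE_MOL2 = """\
@<TRIPOS>MOLECULE
ammonium (illustrative charges)
5 4 1
SMALL
USER_CHARGES

@<TRIPOS>ATOM
      1 N1   0.0000  0.0000  0.0000 N.4 1 LIG -0.6000
      2 H1   0.5900  0.5900  0.5900 H   1 LIG  0.4000
      3 H2  -0.5900 -0.5900  0.5900 H   1 LIG  0.4000
      4 H3  -0.5900  0.5900 -0.5900 H   1 LIG  0.4000
      5 H4   0.5900 -0.5900 -0.5900 H   1 LIG  0.4000
@<TRIPOS>BOND
     1 1 2 1
     2 1 3 1
     3 1 4 1
     4 1 5 1
"""

print(f"{net_partial_charge(EXAMPLE_MOL2):+.2f}")  # prints +1.00
```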


It seems to me that Marvin (and molconvert as well) converts mol2 to sdf by deriving the atoms' formal charges from the partial charges, without using the connectivity at all. However, the sdf file must carry formal charges that are consistent with the atom connectivity. That is not the case here, which compromises the consistency of the structures and has strongly deleterious effects on calculations performed on them.


The attachment contains another example, downloaded from ZINC. The ZINC sdf file clearly shows that the molecule is neutral. Likewise, the ZINC mol2 file has no net charge, even though individual atoms have positive or negative partial charges. The sdf file that Marvin generates from the ZINC mol2 file shows a molecule with three negative charges. Unfortunately, I think there is a bug here!


Best regards,


Julio.

ChemAxon 5433b8e56b

18-11-2012 22:47:06

Hi Julio,


After checking your second example, I am forwarding this issue to the Molecule representation and file formats forum. My colleagues will examine what is happening here and answer you soon.


Thank you for your report and for your patience in the meantime.


Regards,
Istvan

ChemAxon 044c6721bc

19-11-2012 13:50:32

Hi Julio,


Thanks for reporting this problem.


Because we don't know the exact algorithm used by Open Babel, we use an intuitive approximation: if the partial charge is greater than a value X, we set a charge on the atom. We know it isn't the best solution, but at the moment it isn't a high-priority task. If you have any ideas for a solution, we can implement it.
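To make the behaviour of such a threshold rule concrete, here is a small Python sketch. It is not Marvin's code: the function name, the threshold of 0.5, and the symmetric treatment of negative partial charges are all assumptions made for illustration, and the charge values are invented.

```python
def threshold_formal_charges(partial_charges, x=0.5):
    """Assign +1/-1 to atoms whose partial charge exceeds the threshold x.

    Hypothetical illustration of a threshold rule; the real value of X
    and the handling of negative charges are assumptions.
    """
    formal = []
    for q in partial_charges:
        if q > x:
            formal.append(1)
        elif q < -x:
            formal.append(-1)
        else:
            formal.append(0)
    return formal


# Illustrative ammonium ion: N first, then four H atoms.
partials = [-0.6, 0.4, 0.4, 0.4, 0.4]
formal = threshold_formal_charges(partials)

print(formal)                # [-1, 0, 0, 0, 0]
print(sum(formal))           # -1
print(round(sum(partials)))  # 1
```

With these numbers the rule puts a negative formal charge on nitrogen although the partial charges sum to +1, which is the kind of sign flip reported earlier in the thread.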


Best,


Janos

User f7c6611bf1

26-11-2012 12:14:08

Dear Janos,


I understand that this is not a high-priority task; however, it is not a matter of which solution is best. Not only is the conversion incorrect, but the structure perception within Marvin and other ChemAxon tools is also incorrect when the structures are in mol2 format, and this affects the quality of several calculated properties.


I agree that converting mol2 to sdf is not a common task (the reverse is more common); however, some calculations performed by ChemAxon products on structures in mol2 format have the same problem. For instance, if you load a mol2 file in MarvinView, then depending on the partial charges some of the calculations available under the Tools menu will not work (if you try to calculate charges or hydrogen bond donors/acceptors, you get the error message "Inconsistent molecular structure"), even if you do not use the format conversion. The conversion is made automatically during structure perception.


I noticed the problem when I tried to calculate pharmacophores using Pmapper (with the Pharma-calc.xml configuration file) and then checked the results with MarvinView. For the example I sent before, the results obtained with the original mol2 file and with the sdf file produced by Marvin are the same, and they differ from those obtained with the original sdf file. No error is reported. See below the PMAP vectors for the file downloaded from ZINC:


zinc_2640583.sdf


r;r;r;r;r;r;h;a/d;r;r;r;r;r;r;a/r;r;a/r;;;;;;;;


zinc_2640583.mol2


r;r;r;r;r;r;h;a/d;r;r;r;r;r;r;d/r;r;d/r;;;;;;;;


zinc_2640583.mol2.babel.sdf


r;r;r;r;r;r;h;a/d;r;r;r;r;r;r;a/r;r;a/r;;;;;;;;


As you can see, the calculations performed on the ZINC sdf file and on the Babel mol2-to-sdf conversion produce the same results, which differ from those produced from the original mol2 file. The hydrogen bond donor/acceptor perception differs considerably. Note that no negatively charged atom appears.


Moreover, the results can be very different depending on how the partial charges were calculated. After an in-house protocol that I use for conformation and partial charge calculations, the same compound, zinc_2640583, is perceived by Marvin as a molecule having three negative charges and two positive charges, and Pmapper produced the following PMAP vector:


ZINC02640583.mol2


r;r;r;r;r;r;+/r;r;r;r;+/r;r;-/a/r;-/a/r;-/a;r;h;;;;;;;;


Please take this criticism as a contribution to help ChemAxon improve its very fine and useful tools. I am a heavy user of ChemAxon software and I appreciate it very much. Once the problem was detected in my lab, I tried to solve it as soon as possible, and the solution was to use Babel to do the conversion beforehand. However, users who are not aware of the problem will continue to get errors.


Since you asked for suggestions, I think you could try a few approaches. One is to calculate the net charge (simply summing all partial charges) and compare it with the total of the formal charges produced. In my opinion, if they diverge, it is better to produce no result than to produce a wrong structure. Another approach is to use the atom connectivity together with the bond orders available at the end of the mol2 file: it is enough to check each atom's neighbors and bond orders and compare them with the typical valence of the atom. Some cases can be problematic, for instance the detection of carbocations and carbanions.
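The two suggestions can be combined in a rough Python sketch. Everything here is hypothetical: the function names, the tolerance, and the very small valence table are assumptions. It also assumes that all hydrogens are explicit and that bond orders are already integers (aromatic or amide mol2 bond types would have to be resolved first). Carbon atoms with an unusual valence are rejected rather than guessed, because connectivity alone cannot tell a carbocation from a carbanion.

```python
# Simplified, hypothetical table of typical valences.
TYPICAL_VALENCE = {"H": 1, "C": 4, "N": 3, "O": 2}


def formal_charges_from_bonds(elements, bonds):
    """Derive formal charges from connectivity.

    elements: list of element symbols, one per atom.
    bonds: list of (i, j, order) with 0-based indices and integer orders.
    """
    bond_sum = [0] * len(elements)
    for i, j, order in bonds:
        bond_sum[i] += order
        bond_sum[j] += order

    charges = []
    for index, (element, used) in enumerate(zip(elements, bond_sum)):
        if element not in TYPICAL_VALENCE:
            raise ValueError(f"atom {index}: no valence rule for {element}")
        excess = used - TYPICAL_VALENCE[element]
        if element in ("N", "O"):
            # An extra bond means +1 (e.g. NH4+), a missing bond -1 (e.g. OH-).
            charges.append(excess)
        elif excess == 0:
            charges.append(0)
        else:
            # For C and H the sign cannot be decided from connectivity alone.
            raise ValueError(f"atom {index} ({element}): valence {used} is ambiguous")
    return charges


def charges_or_refuse(elements, bonds, partial_charges, tolerance=0.1):
    """Return formal charges only if they agree with the net partial charge."""
    formal = formal_charges_from_bonds(elements, bonds)
    net = sum(partial_charges)
    if abs(sum(formal) - net) > tolerance:
        raise ValueError(f"formal total {sum(formal)} disagrees with net charge {net:+.2f}")
    return formal


# Illustrative ammonium ion with invented partial charges.
elements = ["N", "H", "H", "H", "H"]
bonds = [(0, 1, 1), (0, 2, 1), (0, 3, 1), (0, 4, 1)]
partials = [-0.6, 0.4, 0.4, 0.4, 0.4]
print(charges_or_refuse(elements, bonds, partials))  # [1, 0, 0, 0, 0]
```

Raising an error on disagreement follows the idea that producing no result is preferable to writing a wrong structure.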


However, I think that ChemAxon already has the perfect tools for the job. The Structure Checker could be used to fix almost all of these problems. Maybe you could use some of its subroutines to perceive and fix the problems automatically.


The attachment shows the results I obtained when I submitted two mol2 files of the same ZINC compound to Structure Checker (one is the original ZINC mol2 file and the other is the file I generated with my in-house protocol). I used the Valence Error Checker (Fix->FixValence) and the Molecule Charge Checker (Fix->Neutralize) with their default options. Under Set Options I selected the Fix option ("Solve problems with fixer configured for each checker. Users are prompted to resolve any unfixable or conflicting problems."). As output, Structure Checker produced the attached sdf file, in which both structures are correct. All problems were solved without the need for manual intervention.


I hope you find these suggestions helpful.


Best regards,


Julio.

ChemAxon 044c6721bc

28-11-2012 09:51:58

Hi Julio,


Thanks for the suggestions; we'll try to find some time to work on it. I will let you know if there is any progress on this task.


Best,


Janos