Problem with molconvert

User 4aada85f0d

25-06-2015 13:03:20

Hello,


I have a problem with Document to Structure.


If I try to run molconvert on a PDF, I get the following (truncated):


$ molconvert smiles:a test.pdf -o test.smi
Jun 25, 2015 1:53:03 PM chemaxon.naming.document.TesseractProcessOCR isAvailable
WARNING: Tesseract could not be installed, OCR is disabled
java.io.IOException: I'm sorry, I could not find tesseract-unknown-3.01.jar
at chemaxon.marvin.util.InstalledComponent.findLocalJar(InstalledComponent.java:195)
at chemaxon.marvin.util.InstalledComponent.nonAppletInstall(InstalledComponent.java:171)

MarvinBeans: marvinbeans-15.6.15.0-macos


OS: Mac OS X Yosemite (10.10.3).


Java: 1.6.0.jdk


tesseract (via Homebrew): tesseract-3.02.02_3


Is it that this functionality is not supported on OS X? I see 'tesseract-X-3.01.jar' files for Windows and Linux but not Mac.


Thanks,


Francis

ChemAxon e7b9408ca1

25-06-2015 13:47:20

Hi Francis,


This should work on Mac OS. However, there might be a few issues to solve to get it working. First, I suspect you are actually running an older version of Marvin. Could you please check what this command prints:


molconvert | head -1
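
If more than one Marvin installation might be present, it can also help to list every molconvert on the PATH together with its version banner. The following is only a minimal sketch: it assumes a POSIX shell and assumes that `molconvert -h` prints the version on its first line. `list_on_path` is a hypothetical helper written for this example, not part of Marvin.

```shell
# Hypothetical helper: print every executable called $1 found on PATH,
# in the order the shell would look them up.
# The body runs in a subshell, so the IFS and globbing changes stay local.
list_on_path() (
    set -f          # do not expand wildcards in PATH entries
    IFS=:
    for dir in $PATH; do
        [ -n "$dir" ] || dir=.      # an empty PATH entry means the current directory
        if [ -f "$dir/$1" ] && [ -x "$dir/$1" ]; then
            printf '%s\n' "$dir/$1"
        fi
    done
)

# Show the first line of help output for each copy that was found.
list_on_path molconvert | while IFS= read -r exe; do
    printf '%s: ' "$exe"
    "$exe" -h 2>&1 | head -1
done
```

The first path printed is the one a plain `molconvert` command runs; any later entries are shadowed installations.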

User 4aada85f0d

25-06-2015 14:14:45

Oh dear, I *was* running an old version (/Applications/MarvinBeans/bin/molconvert) by mistake and not the new version I'd installed.


However, now that's fixed, I have a slightly different problem...


$ whence molconvert
/Applications/ChemAxon/MarvinBeans/bin/molconvert

$ molconvert -h | head -1
Molecule File Converter, version 15.6.15.0, (C) 1999-2015 ChemAxon Ltd.

$ molconvert smiles:a test.pdf -o test.smi
Jun 25, 2015 3:09:55 PM chemaxon.naming.document.TesseractProcessOCR isAvailable
WARNING: Tesseract could not be installed, OCR is disabled
java.io.IOException: Missing resource: /tesseract-macosx-3.01.zip
at chemaxon.util.InstalledComponent.installInto(InstalledComponent.java:224)

I do have a Mac OS X jar now, though...

-rw-r--r--  1 francis  admin  1451901 15 Jun 13:32 /Applications/ChemAxon/MarvinBeans/lib/tesseract-macosx-3.01_1.jar
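
Since a jar is an ordinary zip archive, one way to look into the `Missing resource` message is to check whether that jar actually contains an entry with the expected name. This is only a sketch under assumptions: it assumes Info-ZIP `unzip` is installed, and it guesses that the resource is expected at the top level of the jar, which is not confirmed anywhere in this thread. `jar_has_entry` is a hypothetical helper written for this example.

```shell
# Hypothetical helper: succeed if archive $1 contains an entry named exactly $2.
# "unzip -Z1" lists the entry names only, one per line.
jar_has_entry() {
    unzip -Z1 "$1" 2>/dev/null | grep -Fxq -- "$2"
}

# Path and resource name taken from the messages above; adjust for your installation.
jar=/Applications/ChemAxon/MarvinBeans/lib/tesseract-macosx-3.01_1.jar
entry=tesseract-macosx-3.01.zip   # assumption: the leading "/" means the jar's top level

if jar_has_entry "$jar" "$entry"; then
    echo "found $entry in $jar"
else
    echo "no $entry in $jar (or the jar could not be read)"
fi
```

Running `unzip -Z1 "$jar"` on its own prints the full list of entries, which shows what the jar does contain.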

ChemAxon e7b9408ca1

26-06-2015 06:29:51

Good, we're making progress :) There is indeed an issue with tesseract on Mac OS in our current version. This should be fixed in next week's version. Is this OK for you, or would you need a workaround sooner?

User 4aada85f0d

26-06-2015 08:34:41

Next week would be fine! Any particular release number I should look out for?

ChemAxon e7b9408ca1

26-06-2015 12:00:42

This fix should be included in version 15.06.29.

ChemAxon e7b9408ca1

01-07-2015 06:25:49

Version 15.06.29 is now released. Francis, could you please confirm if it fixes the issue for you?

User 4aada85f0d

01-07-2015 13:54:13

I think that's fixed, yes: I've no problems with any of the PDFs I've tried it on so far. Many thanks!

ChemAxon e7b9408ca1

03-07-2015 06:23:21

Great! Thanks for your report and your patience.