
I need to determine the MIME types of files without a suffix in Python 3, and I thought python-magic would be a fitting solution. Unfortunately, it does not work as described here: https://github.com/ahupp/python-magic/blob/master/README.md

What happens is this:

>>> import magic
>>> magic.from_file("testdata/test.pdf")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
AttributeError: 'module' object has no attribute 'from_file'

So I had a look at the module, which provides a class Magic, for which I found documentation here: http://filemagic.readthedocs.org/en/latest/guide.html

I was surprised that this did not work either:

>>> with magic.Magic() as m:
...     pass
... 
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: __init__() missing 1 required positional argument: 'ms'
>>> m = magic.Magic()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: __init__() missing 1 required positional argument: 'ms'
>>> 

I could not find any information about how to use the class Magic, so I resorted to trial and error until I figured out that it only accepts instances of LP_magic_set for ms. Some of these are returned by the module's functions magic.magic_set() and magic_t(), so I tried to instantiate Magic with either of them. However, when I then call the file() method on the instance, it always returns an empty result, and the errlvl() method reports error no. 22. So how do I use magic?

JasonMArcher
Richard Neumann
  • Do you have a `magic.py` file in the same directory as the one you launched the python shell from? The errors you got make it sound like you do (as I just got all your examples working). One way you can find out is `import inspect` then `inspect.getfile(magic)` and see whether this is the expected file for the `magic` module. – metatoaster Aug 13 '14 at 12:38
  • `>>> import inspect >>> inspect.getfile(magic) '/usr/lib/python3.4/site-packages/magic.py'` – Richard Neumann Aug 13 '14 at 12:46
  • Oh wait, you are referring to Ubuntu's `python-magic`. Yeah that's a completely different package to the one you looked at. Really though you could take a cursory glance at that file returned by `inspect.getfile` and see that it probably completely differs from [the one on GitHub](https://github.com/ahupp/python-magic/blob/master/magic.py). – metatoaster Aug 13 '14 at 12:50
  • Another thing, for future reference: from the Python shell you can call `help(obj)` to get help on `obj` from its built-in documentation. In this case, `help(magic)` will bring up the docstrings and available methods, which clearly show that what you have on your system is not the same thing you have the documentation for. – metatoaster Aug 13 '14 at 13:11
  • I already read the module, which was not helpful at all. By the way, it is the Arch package. – Richard Neumann Aug 13 '14 at 13:19
  • As I said, the distro version of the `python-magic` package is NOT the same as the one you linked on github. They have the exact same name but are completely different. Heck, even the version number on pypi as linked (https://pypi.python.org/pypi/python-magic/) is only at 0.4.6. Also, the distro version is the *bindings* to the magic library, which are NOT the pure native version. The only help you can get is from realizing your mistake here. – metatoaster Aug 13 '14 at 13:25
  • Got it. My only mistake was not realizing that these are completely different programs. This does not, however, change the problem I describe in my question: how to use that specific library, not some completely different code from GitHub. Thankfully, @mhawke found out how. – Richard Neumann Aug 14 '14 at 14:13

1 Answer


I think that you are confusing different implementations of "python-magic".

You appear to have installed python-magic-5.19.1; however, you reference first the documentation for python-magic-0.4.6 and then that for filemagic-1.6. I think you are better off using python-magic-0.4.6, as it is readily available on PyPI and easily installed via pip into virtualenv environments.
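To check which of these implementations a given `import magic` actually resolved to, a rough heuristic along the following lines may help. It is only a sketch: the function name `identify_magic_module` is made up for illustration, the checks rely on attributes mentioned in this thread (`from_file` from the GitHub README, `open` and `MAGIC_NONE` from the distro bindings), and the `id_filename` check for filemagic is an assumption based on its guide.

```python
def identify_magic_module(module):
    """Guess which 'magic' implementation `module` is (heuristic, hypothetical helper)."""
    # The GitHub/PyPI python-magic (0.4.x) documents a module-level from_file().
    if hasattr(module, "from_file"):
        return "python-magic (PyPI/GitHub)"
    # Assumption: filemagic's Magic class offers an id_filename() method.
    magic_class = getattr(module, "Magic", None)
    if magic_class is not None and hasattr(magic_class, "id_filename"):
        return "filemagic"
    # The bindings shipped with file/libmagic are used via open() and MAGIC_* flags.
    if hasattr(module, "open") and hasattr(module, "MAGIC_NONE"):
        return "libmagic bindings (file)"
    return "unknown"
```

After `import magic`, calling `identify_magic_module(magic)` together with `inspect.getfile(magic)` should give a reasonable idea of which documentation applies.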

Documentation for python-magic-5.19.1 is hard to come by, but I managed to get it to work like this:

>>> import magic
>>> m=magic.open(magic.MAGIC_NONE)
>>> m.load()
0
>>> m.file('/etc/passwd')
'ASCII text'
>>> m.file('/usr/share/cups/data/default.pdf')
'PDF document, version 1.5'

You can also get different magic descriptions, e.g. MIME type:

>>> m=magic.open(magic.MAGIC_MIME)
>>> m.load()
0
>>> m.file('/etc/passwd')
'text/plain; charset=us-ascii'
>>> m.file('/usr/share/cups/data/default.pdf')
'application/pdf; charset=binary'

Or, with more recent versions such as python-magic-5.30:

>>> import magic
>>> magic.detect_from_filename('/etc/passwd')
FileMagic(mime_type='text/plain', encoding='us-ascii', name='ASCII text')
>>> magic.detect_from_filename('/etc/passwd').mime_type
'text/plain'
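If the same code has to run against both older and newer versions of these bindings, the two approaches above could be combined roughly as follows. This is a sketch, not part of the library: `mime_type_of` is a hypothetical helper, the module is passed in as an argument so the caller decides which `magic` to use, and it assumes the cookie returned by `magic.open()` also provides `error()` and `close()`.

```python
def mime_type_of(path, magic_module):
    """Return the bare MIME type of `path`, e.g. 'text/plain' (hypothetical helper)."""
    # Newer bindings: detect_from_filename() returns an object with .mime_type.
    detect = getattr(magic_module, "detect_from_filename", None)
    if detect is not None:
        return detect(path).mime_type

    # Older bindings: open a cookie, load the default database, query the file.
    cookie = magic_module.open(magic_module.MAGIC_MIME)
    try:
        if cookie.load() != 0:
            raise RuntimeError("could not load magic database: %s" % cookie.error())
        result = cookie.file(path)
        if not result:
            raise RuntimeError("could not identify %r: %s" % (path, cookie.error()))
        # MAGIC_MIME yields 'type/subtype; charset=...'; keep only the type part.
        return result.split(";", 1)[0].strip()
    finally:
        cookie.close()
```

With the distro bindings installed, this would be called as `mime_type_of('/etc/passwd', magic)` after `import magic`.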
Muayyad Alsadi
mhawke
  • Thanks. Couldn't find the documentation for bindings to that specifically for Python to help this person who can't seem to understand the difference. – metatoaster Aug 13 '14 at 13:26
  • I couldn't find any documentation either, just had to read the module source to guess how to use it. – mhawke Aug 13 '14 at 13:27
  • @MHawke - Thanks for providing a working example with both output formats. Here is a single example from the libmagic repo (or a fork of it?): https://github.com/threatstack/libmagic/blob/master/python/example.py This is the only documentation I could find that presents a workflow for using the module. – Lars Nordin Jun 11 '16 at 11:31
  • There's a lot of confusion around the two Python bindings (I found a bug report in the Ubuntu packages from people trying to use the ahupp version of the library with the standard one). Anyway, you can get the same result without open and load: `magic.detect_from_filename('your_file').mime_type` directly provides the expected answer. – Marwan Burelle May 04 '17 at 09:24
  • @MarwanBurelle: Thanks. This answer refers to `file-5.19`, however, `detect_from_filename()` was added in version `file-5.26`. To be strict, the return values are different with one being a string and the other a `namedtuple`, but your suggestion is certainly easier to use if using `file-5.26` or later. – mhawke May 04 '17 at 11:30
  • The libmagic home page seems to be http://darwinsys.com/file/ and there are links to the official repos there. It appears the Python module is just a wrapper for libmagic(3), so the man page for libmagic may be helpful. For example, in Python `magic.open(magic.MAGIC_SYMLINK)` is the same as the C API call `magic_open(MAGIC_SYMLINK)` and the same as the shell command `file -L`. – Alcamtar Feb 21 '20 at 20:03