Is there a way, or a package, to guess the type of a file in Python? For example, is there a way to detect whether a file can be opened as ASCII, Unicode, or binary?

Thanks in advance!

Erxin
  • [EAFP](http://docs.python.org/2/glossary.html#term-eafp) may be the key here. Just try to open it as ASCII and, if that fails, open it as Unicode. If that fails too (somehow, during your processing), treat it as binary. – Tadeck Sep 13 '13 at 02:07
  • @Tadeck Yes, that is a way to solve the example's request, but is there a way to return more detailed results, such as MIME-related info? – Erxin Sep 13 '13 at 02:10
  • http://stackoverflow.com/q/43580/ http://stackoverflow.com/q/10937350/ – flornquake Sep 13 '13 at 02:13
  • @user2246674 I think it is OK when the Unicode can be parsed as ASCII in non-international programs. A better way may be to check the BOM first if the file can be opened as text. – Erxin Sep 13 '13 at 02:14
  • @flornquake Thank you, these are the answers I want to know. – Erxin Sep 13 '13 at 02:20
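
The try-ASCII-then-Unicode-then-binary idea from the comments, combined with the BOM check, could be sketched roughly as below using only the standard library. The function name `guess_encoding` is hypothetical, and treating any NUL byte as a sign of binary data is an added heuristic (an assumption, not something from the thread); UTF-16 or UTF-32 text without a BOM will therefore be reported as binary.

```python
import codecs

# BOMs to look for. UTF-32 LE is listed before UTF-16 LE because its BOM
# (ff fe 00 00) starts with the UTF-16 LE BOM (ff fe).
_BOMS = [
    (codecs.BOM_UTF32_LE, "utf-32"),
    (codecs.BOM_UTF32_BE, "utf-32"),
    (codecs.BOM_UTF8, "utf-8-sig"),
    (codecs.BOM_UTF16_LE, "utf-16"),
    (codecs.BOM_UTF16_BE, "utf-16"),
]


def guess_encoding(data):
    """Return 'ascii', a Unicode codec name, or 'binary' for a bytes object."""
    # 1. A BOM is the strongest hint, so check it first.
    for bom, name in _BOMS:
        if data.startswith(bom):
            try:
                data.decode(name)
                return name
            except UnicodeDecodeError:
                return "binary"
    # 2. Heuristic (assumption): NUL bytes rarely appear in ASCII/UTF-8 text.
    if b"\x00" in data:
        return "binary"
    # 3. EAFP: try the strictest codec first, then fall back.
    for name in ("ascii", "utf-8"):
        try:
            data.decode(name)
            return name
        except UnicodeDecodeError:
            pass
    return "binary"


def guess_file_encoding(path):
    """Read the whole file and classify it; fine for small files only."""
    with open(path, "rb") as f:
        return guess_encoding(f.read())
```

For example, `guess_encoding(b"hello")` gives `'ascii'`, while the same call on the UTF-8 bytes of `"adsfadsf←"` gives `'utf-8'`. This only answers the "text or binary" part of the question; it says nothing about MIME types.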

2 Answers

You want the filemagic module.

Eric Urban

If you're on a Unix OS (Linux or Mac), you have access to libmagic. On a Mac, you'll likely need to run `brew install libmagic` first. There's a Python library called filemagic for using it from your Python scripts.

    import magic
    mage = magic.Magic()
    mage.id_buffer("adsfadsf←")

The last line returns 'UTF-8 Unicode text, with no line terminators'.

You can also have it check files. The result is based on the magic bytes at the start of the file rather than on the filename:

    mage.id_filename("filename.png")  # placeholder path
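
To illustrate what "magic bytes at the start of the file" means, here is a minimal standard-library sketch that does not use libmagic at all. The names `SIGNATURES`, `sniff_bytes`, and `sniff_file` are made up for this example, and the table covers only a handful of well-known signatures, whereas a real magic database holds thousands of far more detailed rules.

```python
# A few well-known file signatures mapped to MIME types.
# Illustration only: this is not how libmagic stores its rules.
SIGNATURES = [
    (b"\x89PNG\r\n\x1a\n", "image/png"),
    (b"GIF87a", "image/gif"),
    (b"GIF89a", "image/gif"),
    (b"\xff\xd8\xff", "image/jpeg"),
    (b"%PDF-", "application/pdf"),
    (b"PK\x03\x04", "application/zip"),
]


def sniff_bytes(header):
    """Return a MIME type if the leading bytes match a known signature."""
    for magic_bytes, mime in SIGNATURES:
        if header.startswith(magic_bytes):
            return mime
    return None  # unknown; the filename is never consulted


def sniff_file(path):
    """Read only the first few bytes of the file and classify them."""
    with open(path, "rb") as f:
        return sniff_bytes(f.read(16))
```

Because only the content is inspected, a PNG image saved under a name like `notes.txt` would still come back as `image/png`. On platforms where libmagic is hard to install, a small table like this may be enough for a search script that only needs to skip a few binary formats.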

Kyle Kelley
  • Same as `file -b filename.png` in shell. – Santosh Kumar Sep 13 '13 at 03:24
  • @Kyle, I'm currently working on Windows. I have tried [python-magic](https://github.com/ahupp/python-magic), but it doesn't work; it always throws the exception *could not find any magic files*. I will try the filemagic lib right now. – Erxin Sep 13 '13 at 03:39
  • Do you use cygwin at all? This may be overkill anyway. What do you need to know the filetype for? – Kyle Kelley Sep 13 '13 at 03:41
  • @Kyle, I don't use cygwin; I just installed the lib with pip and tried to import it with pythonwin. I want to write an enhanced search script. – Erxin Sep 13 '13 at 03:45