I want to make a program that checks whether a file (path) contains text that can be "printed."

For example:

interesting = ["apple.txt", "plane.mov", "yeezer.mp3", "joker.py"]

So apple and joker would be printed, but plane and yeezer would result in an error.

Is there an existing function I can use for this, or should I just brute-force it (check the file type manually to see if it can contain text)?

Edit: I found a solution: just use try/except. If opening the file with open(path, "r") and reading it raises an error (for example UnicodeDecodeError), then it's not a text file; if it doesn't, we can print it.
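Here is a minimal sketch of that try/except idea. The function name `is_printable_text` and the UTF-8 default are my own assumptions, not something from the question. Note that `open()` in text mode usually succeeds even on binary files; the decode error only shows up once you read, so the sketch reads the whole file inside the `try`.

```python
def is_printable_text(path, encoding="utf-8"):
    """Return True if the file at `path` can be read and decoded as text.

    Hypothetical helper: the name and the UTF-8 default are assumptions.
    """
    try:
        # Opening alone is not enough; decoding happens on read().
        with open(path, "r", encoding=encoding) as f:
            f.read()
    except (UnicodeDecodeError, OSError):
        # Undecodable bytes, or the file is missing/unreadable.
        return False
    return True
```

This is only a heuristic: a binary file whose bytes happen to be valid UTF-8 would still pass, and a text file saved in another encoding would be rejected unless you pass that encoding.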

  • Define "can be printed". Any file fundamentally contains bytes which can be printed to the terminal if you like. What distinction are you trying to draw here? – Silvio Mayolo Jul 05 '21 at 23:46
  • You could make a list of file extensions which should be printed, or you could use a library like [filemagic](https://pypi.org/project/filemagic/) to decide which files are printable. There's some ambiguity in your question, though. For example, is an HTML file printable? – Nick ODell Jul 05 '21 at 23:49
  • Does this help? [How can I detect if a file is binary (non-text) in Python?](https://stackoverflow.com/questions/898669/how-can-i-detect-if-a-file-is-binary-non-text-in-python) – Lei Yang Jul 06 '21 at 01:10
  • @LeiYang Thanks, I just made my own solution, it turned out to be similar to one on that post. – Titan31 Hunter29 Jul 06 '21 at 05:23

1 Answer

Silvio Mayolo made an excellent comment. To extend it further: decide which file types you consider "printable", then match each filename's extension against that list (for example with a regular expression). That way you only attempt to print files that qualify.
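As a rough sketch of that extension check: the extension list below (`txt`, `py`, `md`, `csv`) and the helper name `printable_names` are assumptions for illustration, so substitute your own definition of "printable".

```python
import re

# Assumed set of "printable" extensions; adjust to your own definition.
PRINTABLE = re.compile(r"\.(txt|py|md|csv)$", re.IGNORECASE)

def printable_names(filenames):
    """Keep only the names whose extension is in the assumed list."""
    return [name for name in filenames if PRINTABLE.search(name)]

interesting = ["apple.txt", "plane.mov", "yeezer.mp3", "joker.py"]
print(printable_names(interesting))  # ['apple.txt', 'joker.py']
```

This only looks at the name, not the contents, so a mislabeled file will slip through; it is cheap, though, and never has to open anything.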