How to call type safely on a random file in Python?

Question

So I am attempting to call the Windows command type on some arbitrary file. Unfortunately, whenever I convert my cmd from a shell command into a non-shell one, it fails. As such, I cannot use the recommended method for ensuring that my Python script cannot be exploited. Here is an example.

import subprocess
cmd = "type" + ' "' + "some_file_with_no_spaces_or_other_things_wrong" + '"'
p = subprocess.Popen(cmd, shell=True)

but when I try:

# Assume the cmd split is done properly. Even when I manually put in the
# array with properly escaped quotes, it does not work.
subprocess.Popen(cmd.split(), shell=False)

It fails, and I don't know how to solve this issue. I would like to call this command securely with shell set to False, but whenever I do so I get the following error.

Traceback (most recent call last):
  File "C:\Users\Skylion\git\LVDOWin\Bin\Osiris.py", line 72, in openFileDialog
    stderr = STDOUT, shell = False, bufsize = 0, universal_newlines=True)
  File "C:\Python34\lib\subprocess.py", line 859, in __init__
    restore_signals, start_new_session)
  File "C:\Python34\lib\subprocess.py", line 1112, in _execute_child
    startupinfo)
FileNotFoundError: [WinError 2] The system cannot find the file specified

Note that even running subprocess.Popen(['type']) throws the error. My question is: how do I sanitize the filename so that I can either run the command with shell=True or get shell=False working properly?

Any help on how to properly open a file in this way would be greatly appreciated.

I do put proper quotation marks around the file name first. The issue here lies with type, not with the arguments of type: providing a non-existent file just makes it print to stderr, not crash Python like it is doing now. — Skylion, Aug 05 '15 at 20:19

score 1 · Accepted Answer · edited May 23 '17 at 12:29

type is an internal cmd.exe command (there is no separate type.exe to launch), so you need to run cmd.exe, e.g., implicitly via shell=True.

If you pass the command as a list on Windows, subprocess.list2cmdline() is called to convert the list into a string that is passed to the CreateProcess() Windows API. Its syntax is different from the cmd.exe syntax. For details, read the links in this answer.
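As a rough illustration of that difference (a sketch only: list2cmdline() is an undocumented helper, and the file names are made up for the example), the string it builds quotes an argument that contains spaces but does nothing about cmd.exe metacharacters such as &:

```python
import subprocess

# list2cmdline() applies the MS C runtime quoting rules, not cmd.exe rules.
# Both file names are hypothetical and only serve to show the resulting strings.
with_spaces = subprocess.list2cmdline(
    ["type", r"C:\path\with spaces & special symbols.txt"])
no_spaces = subprocess.list2cmdline(["type", "a&b.txt"])

print(with_spaces)  # the argument is wrapped in double quotes because of the spaces
print(no_spaces)    # "&" is left bare; cmd.exe would read it as a command separator
```

The function only builds a string, so the snippet can be run on any platform to inspect the result.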

Pass the shell command as a string and add shell=True:

from subprocess import check_call

check_call(r'type "C:\path\with spaces & special symbols.txt"', shell=True)

Note: the r'' prefix is used to avoid escaping the backslashes in the string literal.

If the command works as is from the command line then it should work from Python too.

If the filename is given in a variable then you could escape it for the shell cmd using ^:

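# str.replace("", "^") inserts "^" before every character and at the end;
# [:-1] drops the trailing "^", so each character ends up caret-escaped.
escaped_filename = filename_with_possible_shell_meta_chars.replace("", "^")[:-1]
check_call('type ' + escaped_filename, shell=True)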

Note: no explicit quotes.
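The same escaping can be wrapped in a small helper. This is a minimal sketch: caret_escape is a hypothetical name, the file name is made up, and the result has not been checked against every cmd.exe parsing rule.

```python
def caret_escape(filename):
    """Prefix every character with ^ (same result as the replace() trick)."""
    return "".join("^" + ch for ch in filename)


# Hypothetical file name containing a cmd.exe metacharacter.
escaped = caret_escape("a&b.txt")
print("type " + escaped)
```

The resulting string would then be passed to check_call(..., shell=True) as in the snippet above.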

Obviously, you could emulate the type command in pure Python:

TYPE copies to the console device (or elsewhere if redirected). No check is made that the file is readable text.
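One possible emulation, following the description above, copies the file's raw bytes to standard output without checking that they are text. It is a sketch under assumptions: type_file is a hypothetical name, and console specifics (code pages, Ctrl+Z handling) are ignored.

```python
import shutil
import sys


def type_file(path, out=None):
    """Copy the raw bytes of `path` to `out` (default: standard output)."""
    if out is None:
        out = sys.stdout.buffer  # binary layer: no decoding, no text check
    with open(path, "rb") as file:
        shutil.copyfileobj(file, out)
    out.flush()
```

For example, type_file(r'C:\path\with spaces & special symbols.txt') needs no shell and no escaping, because the file name never reaches cmd.exe.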

If all you need is to read a file, use the open() function:

with open(r'C:\path\with spaces & special symbols.txt', 
          encoding=character_encoding) as file:
    text = file.read()

If you don't specify an explicit encoding, open() uses the ANSI code page such as 'cp1252' (locale.getpreferredencoding(False)) to decode the file's content into Unicode text.
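To see which default applies on a given machine and to sidestep it, something like the following could be used. It is a sketch: the temporary sample file and the choice of utf-8 are assumptions made for the example, and the printed default varies by system and Python configuration.

```python
import locale
import os
import tempfile

# Encoding that open() falls back to when none is given (machine-dependent).
default_encoding = locale.getpreferredencoding(False)
print(default_encoding)

# Hypothetical sample file, written and read back with an explicit encoding
# so that the result does not depend on the machine's code page.
path = os.path.join(tempfile.mkdtemp(), "sample.txt")
with open(path, "w", encoding="utf-8") as file:
    file.write("café")
with open(path, encoding="utf-8") as file:
    text = file.read()
print(text)
```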

Note: you have to take into account 4 character encodings here:

1. The character encoding of the text file itself. It can be anything, e.g., utf-8.
2. The ANSI code page used by GUI applications such as notepad.exe, e.g., cp1252 or cp1251.
3. The OEM code page used by cmd.exe, e.g., cp437 or cp866. They can be used for the output of the type command when it is redirected.
4. utf-16 used by the Unicode API such as WriteConsoleW(), e.g., when the cmd /U switch is used. Note: the Windows console displays UCS-2, i.e., only BMP Unicode characters are supported, but copy-paste works even for astral characters such as 😊 (U+1F60A).
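A small illustration of why these encodings must not be mixed up: the same character maps to different bytes in each of them. The character and the particular code pages are picked only for the example.

```python
text = "é"
as_utf8 = text.encode("utf-8")       # one possible encoding of the file itself
as_ansi = text.encode("cp1252")      # an ANSI code page
as_oem = text.encode("cp437")        # an OEM code page
as_utf16 = text.encode("utf-16-le")  # what the Unicode API works with
print(as_utf8, as_ansi, as_oem, as_utf16)

# Decoding OEM bytes with the ANSI code page silently yields a wrong character.
mojibake = as_oem.decode("cp1252")
print(mojibake)

# An astral character needs two UTF-16 code units (a surrogate pair).
astral_units = len("\U0001F60A".encode("utf-16-le")) // 2
print(astral_units)
```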

See Keep your eye on the code page.

To print Unicode to Windows console, see What's the deal with Python 3.4, Unicode, different languages and Windows?

Wow, this is one of the most thorough, concise answers I have ever gotten on StackOverflow. Thank you very much! — Skylion, Aug 06 '15 at 03:11
Out of curiosity, do you know why specifying it as cp437 or cp866 is so much slower than calling type on the command line? Do I maybe need to break it up into a stdout stream? Any ideas how I can optimize it for large files? Otherwise, this answer is absolutely superb. — Skylion, Aug 26 '15 at 18:30
@Skylion: I don't know but a profiler might: `py -m cProfile your_script.py` (look at which functions take the most time). Compared to printing to the Windows console, everything else should be fast. Create two [minimal code examples](http://stackoverflow.com/help/mcve): one calls `type`, another emulates it in Python. Make sure they produce the same result. Describe what you expect to happen and what happens instead (how long it takes, how fast you would like it to be). Post it as a separate question. — jfs, Aug 26 '15 at 23:46
