There is a number of minor problems with your code, but on the whole, yes, this is how you run a subprocess.
- Prefer
subprocess.run
over bare Popen
when you can. This would also avoid the behavior you regard as confusing; see the next bullet point.
- As such, if you do run
Popen
and then communicate
, the output from that is a tuple with two values, the standard output and the standard error of the finished process.
- But you are redirecting standard output to a file, so of course
stdout=subprocess.PIPE
is going to be unnecessary and produce nothing. (Because you don't capture stderr
with stderr=subprocess.PIPE
, it ends up containing None
; if there is any error output, it is simply displayed to the user, and invisible to Python.)
- Your shell script is horribly overcomplicated. Reducing it to a single process would avoid the need for
shell=True
which is generally something you should strive for.
- But even more fundamentally, the script could be reimplemented in native Python, which would make it both more versatile, as well as easier to understand for anyone who is not familiar with both shell script and Python. (Granted, the shell formulation would be a lot more succinct, at least after refactoring.)
The obvious Python implementation would look like
from pathlib import Path
...
with open("sample.txt", "r") as lines, \
Path("~/out.txt").expanduser().open("w") as output:
for line in lines:
if "patternA" in line:
output.write(line.replace('foo', 'bar'))
where obviously we have to guess wildly at what your sed
script actually does, as you have replaced it with a placeholder.
The same with subprocess.run
and avoiding the shell programming antipatterns,
from pathlib import Path
...
with Path("~/out.txt").expanduser.open("w") as output:
subprocess.run(
['sed', '/patternA/something', 'sample.txt'],
stdout=output, text=True, check=True)
You want to avoid the [useless cat
](useless use of cat
and the useless grep
; and with those out of the way, you don't need a pipeline, and thus no shell.
If you want to retrieve status information from the subprocess, assign the result from subprocess.run
to a variable you can examine, say r
; the error status will be in r.resultcode
(though with check=True
it's guaranteed to be 0).
Python won't let you mix capture_output=True
with stdout=...
and/or stderr=...
so if you want to see whether there is error output (there could be a warning message from some tools even when they succeed) you have to split the operation. Perhaps like this:
import logging
from pathlib import Path
...
r = subprocess.run(
['sed', '/patternA/something', 'sample.txt'],
capture_output=True, text=True, check=True)
with Path("~/out.txt").expanduser().open("w") as output:
output.write(r.stdout)
if r.stderr:
logging.warn(r.stderr)
As a final aside, os.path.expanduser()
or pathlib.Path.expanduser()
are necessary to resolve ~/out.txt
to a file in your home directory. You should generally never need to os.chdir()
to find a file; just specify its path name if it's not in the current directory. See also What exactly is current working directory?