I am having trouble finding a way to use the output of the Linux sort command as input to my Python script.
For example, I would like to iterate through the result of sort -mk1 <(cat file1.txt) <(cat file2.txt)
Normally I would use Popen and iterate through it using next and stdout.readline(), something like:

    import os
    import subprocess

    class Reader():
        def __init__(self):
            self.proc = subprocess.Popen(['sort -mk1', '<(', 'cat file1.txt', ')', '<(', 'cat file2.txt', ')'], stdout=subprocess.PIPE)

        def __iter__(self):
            return self

        def __next__(self):
            while True:
                line = self.proc.stdout.readline()
                if not line:
                    raise StopIteration
                return line

    p = Reader()
    for line in p:
        # only print certain lines based on some filter
        print(line)

With the above, I would get an error: No such file or directory: 'sort -mk1'
After doing some research, I guess I can't use Popen and have to use os.execl to run /bin/bash.
So now I try the following:

    import os
    import subprocess

    class Reader():
        def __init__(self):
            self.proc = os.execl('/bin/bash', '/bin/bash', '-c', 'set -o pipefail; sort -mk1 <(cat file1.txt) <(cat file2.txt)')

        def __iter__(self):
            return self

        def __next__(self):
            while True:
                line = self.proc.stdout.readline()
                if not line:
                    raise StopIteration
                return line

    p = Reader()
    for line in p:
        # only print certain lines based on some filter
        print(line)

The problem with this is that it prints all the lines right away. I guess one solution is to redirect the results to a file and then iterate through that file in Python. But I don't really want to save the output to a file and then filter it; that seems unnecessary. Yes, I could use other Linux commands such as awk, but I would like to use Python for further processing.
So questions are:
- Is there a way to make solution one, with Popen, work?
- How can I iterate through the output of sort using the second solution?
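
One possible direction, offered as a minimal sketch rather than a definitive answer: process substitution, <(...), is bash syntax, so the whole command line probably needs to be handed to bash as a single string via bash -c, with Popen capturing its stdout. os.execl replaces the running Python process and never returns, which would explain why the second attempt prints everything immediately. The helper name iter_merged_lines and its bash parameter are hypothetical, and the sketch assumes bash lives at /bin/bash and that the input files are already sorted.

```python
import shlex
import subprocess


def iter_merged_lines(paths, bash="/bin/bash"):
    """Lazily yield the lines produced by `sort -mk1` over the given files.

    Hypothetical helper: assumes `bash` points at a bash binary and that
    every file in `paths` is already sorted (sort -m only merges).
    """
    # Build "<(cat file1) <(cat file2) ..." with each path safely quoted.
    substitutions = " ".join(f"<(cat {shlex.quote(p)})" for p in paths)
    command = f"set -o pipefail; sort -mk1 {substitutions}"

    # Run the command under bash, since <(...) is not understood by Popen itself.
    with subprocess.Popen(
        [bash, "-c", command], stdout=subprocess.PIPE, text=True
    ) as proc:
        # Iterating over the pipe reads one line at a time as sort emits it.
        for line in proc.stdout:
            yield line

    # Leaving the with-block waits for the process, so returncode is set here.
    if proc.returncode != 0:
        raise subprocess.CalledProcessError(proc.returncode, command)
```

With this, something like for line in iter_merged_lines(["file1.txt", "file2.txt"]): could apply the filter in Python without writing an intermediate file.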