I'm trying to build a Python script that lets me dynamically build up the patterns passed to egrep -v and pipe the output into less (or more).
The reason I want to use external egrep and less is that the files I am processing are very large text files (500MB+). Reading them into a list first and processing everything natively in Python is very slow.
However, when I use os.system or subprocess.call, it takes a long time to get back to the Python code after I exit less.
My code should work like this:
1. ./myless.py messages_500MB.txt
2. The output of less -FRX for messages_500MB.txt is shown (the complete file).
3. When I press 'q' to exit less -FRX, the Python code should take over and display a prompt for the user to enter text to be excluded. The user enters it and I add it to the list.
4. My Python code builds up egrep -v 'exclude1' and pipes the output to less.
5. The user repeats step 3 and enters more text to be excluded.
6. Now my Python code calls egrep -v 'exclude1|exclude2' messages_500MB.txt | less -FRX
7. The process continues in the same way.
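To make steps 4 to 6 concrete, here is a minimal sketch of how the command string grows with each exclusion. The function name build_command is hypothetical, and the shell quoting via quote is an addition for safety that my actual script does not have. Entries are still treated as egrep patterns, not literal strings.

```python
try:
    from shlex import quote   # Python 3
except ImportError:
    from pipes import quote   # Python 2

def build_command(excluded, filename):
    """Return the shell pipeline for the current list of exclusions."""
    # All exclusions become one egrep alternation: 'exclude1|exclude2|...'
    pattern = '|'.join(excluded)
    if pattern == '':
        # Nothing excluded yet: the empty pattern matches every line
        grep = "egrep " + quote(pattern)
    else:
        # -v inverts the match, so lines matching any exclusion are dropped
        grep = "egrep -v " + quote(pattern)
    return grep + " " + quote(filename) + " | less -FRX"

print(build_command(['exclude1', 'exclude2'], 'messages_500MB.txt'))
# egrep -v 'exclude1|exclude2' messages_500MB.txt | less -FRX
```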
However, this does not work as expected.
* On my Mac, when the user presses q to exit less -FRX, it takes a few seconds for the raw_input prompt to be displayed.
* On a Linux machine, I get loads of 'egrep: writing output: Broken pipe' messages.
* On Linux only, if I press CTRL+C while in less -FRX, exiting less -FRX for some reason becomes much quicker (as intended). On the Mac, my Python program breaks instead.
Here is a sample of my code:
    import subprocess
    import sys

    file = sys.argv[1]  # path to the large text file, e.g. messages_500MB.txt
    excluded = list()
    myInput = ''
    while myInput != 'q':
        # Join all excluded strings into a single egrep alternation
        grepText = '|'.join(excluded)
        if grepText == '':
            command = 'egrep "" ' + file + ' | less -FRX'
        else:
            command = 'egrep -v "' + grepText + '" ' + file + ' | less -FRX'
        subprocess.call(command, shell=True)
        myInput = raw_input('Enter text to exclude, q to exit, # to see what is excluded: ')
        excluded.append(myInput)
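A possibly relevant detail, offered as an assumption rather than a confirmed diagnosis: Python sets SIGPIPE to be ignored at startup, and a child started through subprocess can inherit that setting. If so, egrep would not be killed when less exits and would keep reading the whole file, reporting 'Broken pipe' on failed writes. The sketch below restores the default SIGPIPE action in the child before the shell runs. The names run_pipeline and restore_sigpipe are hypothetical helpers, not part of my script.

```python
import signal
import subprocess

def restore_sigpipe():
    # Runs in the child process just before the shell is executed:
    # put SIGPIPE back to its default action (terminate the process).
    signal.signal(signal.SIGPIPE, signal.SIG_DFL)

def run_pipeline(command):
    """Run a shell pipeline with the default SIGPIPE behaviour in the children."""
    return subprocess.call(command, shell=True, preexec_fn=restore_sigpipe)
```

In the loop above it would stand in for the subprocess.call(command, shell=True) line, i.e. run_pipeline(command).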
Any help would be much appreciated.