TLDR: Is there a clean way to make a list of the matches returned by subprocess.check_output(['pcregrep', '-M', '-e', pattern, file])?
I'm using Python's subprocess.check_output() to call pcregrep -M. Normally I would separate the results by calling splitlines(), but since I'm looking for a multiline pattern, that won't work. I'm having trouble finding a clean way to create a list of the matches, where each entry of the list is one complete (possibly multiline) match.
Here's a simple example file I'm pcregrep'ing:
module test_module(
input wire in0,
input wire in1,
input wire in2,
input wire cond,
input wire cond2,
output wire out0,
output wire out1
);
assign out0 = (in0 & in1 & in2);
assign out1 = cond1 ? in1 & in2 :
cond2 ? in1 || in2 :
in0;
Here's (some of) my Python code:
#!/usr/bin/env python
import re
import subprocess

# Raw string so the regex escapes reach pcregrep unchanged
pattern = r"^\s*assign\s+\bout0\b[^;]+;"
# universal_newlines=True makes check_output() return text rather than bytes
output_str = subprocess.check_output(
    ['pcregrep', '-M', '-e', pattern, "/home/<username>/pcregrep_file.sv"],
    universal_newlines=True).split(';')
# Print out the matches
for idx, line in enumerate(output_str):
    print("output_str[%d] = %s" % (idx, line))
# Clear out the whitespace-only list entries
output_str = [line for line in output_str if re.search(r'\S', line)]
Here is the output:
output_str[0] =
assign out0 = in0 & in1 & in2
output_str[1] =
assign out1 = cond1 ? in1 & in2 :
cond2 ? in1 || in2 :
in0
output_str[2] =
It would be nice if I could do something like
output_list = subprocess.check_output(['pcregrep', '-M', '-e', <pattern>, <file>]).split(<multiline_delimiter>)
without creating garbage to clean up (whitespace-only list entries), or at least to have a delimiter to split() on that is independent of the pattern.
Is there a clean way to create a list of the multiline matches?
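For comparison, here is a minimal sketch of one possible direction: doing the matching in Python itself with re.findall() and re.MULTILINE instead of shelling out to pcregrep. This swaps in a different technique and is not a pcregrep feature. It assumes the file is small enough to read into memory; the names source, ASSIGN_RE, and find_assigns are hypothetical, and the pattern is widened to out\d+ purely for illustration.

```python
import re

# Stand-in for the file contents (assumption: the real file fits in memory
# and would be read with open(path).read()).
source = """\
assign out0 = (in0 & in1 & in2);
assign out1 = cond1 ? in1 & in2 :
              cond2 ? in1 || in2 :
              in0;
"""

# Same idea as the pcregrep pattern, widened to any outN for illustration.
# re.MULTILINE lets ^ anchor at the start of every line; [^;]+ is free to
# span newlines, so a statement broken over several lines is one match.
ASSIGN_RE = re.compile(r"^[ \t]*assign\s+out\d+\b[^;]+;", re.MULTILINE)


def find_assigns(text):
    """Return each complete assign statement as one list entry."""
    return ASSIGN_RE.findall(text)


matches = find_assigns(source)
for idx, match in enumerate(matches):
    print("matches[%d] = %s" % (idx, match))
```

Because findall() returns one string per match, there is no delimiter to choose and no whitespace-only entries to filter out afterwards.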