
I want to do an AWK-style range regex like this:

awk '/hoststatus/,/\}/' file

In AWK, this prints every line from one matching /hoststatus/ through the next one matching /\}/, inclusive:

hoststatus {
host_name=myhost
modified_attributes=0
check_command=check-host-alive
check_period=24x7
notification_period=workhours
check_interval=5.000000
retry_interval=1.000000
event_handler=
}

How do I do that in Ruby?

Bonus: How would you do it in Python?

This is really powerful in AWK, but I'm new to Ruby, and not sure how you'd do it. In Python I was also unable to find a solution.

2 Answers

Ruby:

str =
"drdxrdx
hoststatus {
host_name=myhost
modified_attributes=0
check_command=check-host-alive
check_period=24x7
notification_period=workhours
check_interval=5.000000
retry_interval=1.000000
event_handler=
}"
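# The range operator in a condition acts as a flip-flop: it switches on at a
# line matching /hoststatus/ and off after a line matching /\}/.
str.each_line do |line|
  print line if (line =~ /hoststatus/)..(line =~ /\}/)
end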

This is the infamous flip-flop.

steenslag
  • @Tim Schaefer It is not a single regex; it is two regexes, each switching the other on as the active one. AFAIK Ruby stole this from Perl, and Perl stole it from AWK. – steenslag Oct 19 '12 at 22:25
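
For the Python half of the question, the same switching behaviour can be sketched with a small generator. This is only an illustration, not a built-in feature: Python has no flip-flop operator, and `flip_flop`, `sample`, and `block` are names made up for this example. It works on any iterable of lines, so an open file object could be passed in place of the sample list and read one line at a time.

```python
import re

def flip_flop(lines, start, stop):
    """Yield lines from one matching `start` through the next one matching `stop`."""
    active = False
    for line in lines:
        # Switch on when the start pattern is seen.
        if not active and re.search(start, line):
            active = True
        if active:
            yield line
            # The stop pattern is also tested on the line that opened the
            # range, as with AWK ranges and Ruby's two-dot flip-flop.
            if re.search(stop, line):
                active = False

# Hypothetical sample input, shortened from the question's data.
sample = """drdxrdx
hoststatus {
host_name=myhost
event_handler=
}
trailing line
""".splitlines(keepends=True)

block = list(flip_flop(sample, r"hoststatus", r"\}"))
print("".join(block), end="")
```

Because the generator only keeps a single boolean of state, memory use does not grow with the size of the input.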
In Python, pass the MULTILINE and DOTALL flags to re. The ? following the * makes the match non-greedy, so each match stops at the first closing brace that sits alone on a line.

>>> import re
>>> with open('test.x') as f:
...     # DOTALL lets . cross newlines; MULTILINE anchors ^ and $ at each line
...     print(re.findall(r'^hoststatus.*?\n\}$', f.read(), re.DOTALL | re.MULTILINE))
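
The effect of the `?` can be checked on an in-memory string. The sketch below is only an illustration: `text` is made-up sample data with two hoststatus blocks, and `lazy` and `greedy` are names chosen for this example. It assumes, as the pattern does, that `hoststatus` starts a line and the closing brace sits alone on its line.

```python
import re

# Hypothetical input with two hoststatus blocks and an unrelated block between them.
text = """hoststatus {
host_name=first
}
servicestatus {
service_description=ping
}
hoststatus {
host_name=second
}
"""

flags = re.DOTALL | re.MULTILINE

# Non-greedy: each match ends at the first line consisting of "}".
lazy = re.findall(r"^hoststatus.*?\n\}$", text, flags)

# Greedy: a single match runs from the first "hoststatus" to the last "}".
greedy = re.findall(r"^hoststatus.*\n\}$", text, flags)

print(len(lazy))    # 2
print(len(greedy))  # 1
```

With the greedy form the unrelated servicestatus block is swallowed into the one match, which is why the non-greedy quantifier matters here.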
iruvar
  • Basically the same as mine, (but maybe a little better). Still, this has the disadvantage that you end up reading the entire file at once as opposed to reading line-by-line as I assume `awk` does. (I bet `awk` would do this faster too :) – mgilson Oct 19 '12 at 19:06
  • @mgilson, agreed. The only tool I am aware of that can do multiline matches with full regex power, rather than just specifying ranges, and without loading the entire file into memory, is pcregrep. – iruvar Oct 19 '12 at 19:22
  • I was not able to make this work. I've been trying lots of variations on this one and still can't get it to work in python. Thanks anyway... Tim – Tim Schaefer Oct 19 '12 at 23:54
  • @mgilson: performance depends on many things, e.g., [python greps for an ip in a file faster than awk](http://stackoverflow.com/q/9350264/4279) (neither version is written for speed). It is not clear which is faster for a given input: re.findall() or awk's flip-flop. – jfs Oct 20 '12 at 00:22