1

I would like to filter out the lines containing "pattern" and the 5 lines that follow each of them.

Something like grep -v -A 5 'pattern' myfile.txt with output:

other
other
other
other
other
other

I'm interested in Linux shell solutions: grep, awk, sed... Thanks.

myfile.txt:

other
other
other
pattern
follow1
follow2
follow3
follow4
follow5
other
other
other
pattern
follow1
follow2
follow3
follow4
follow5
other
other
other
other
other
other
zsd

3 Answers

4

You can use awk:

awk '/pattern/{c=5;next} !(c&&c--)' file

Basically: we decrement the integer c on each line of input for as long as it is greater than 0, and we print a line only when c is 0. *(see below) Note: awk automatically treats c as 0 on its first use.

When the word pattern is found, we set c to 5 and skip that line with next. The check !(c&&c--) is then false for the following 5 lines, so awk does not print them.


* We could basically use c--<=0 to check whether c is less than or equal to 0. But when there are many(!) lines between the occurrences of the word pattern, c would keep decreasing and could eventually overflow. To avoid that, oguz ismail suggested implementing the check like this:

!(c&&c--)

This checks whether c is truthy (greater than zero) and decrements c only in that case. c will never drop below 0 and therefore cannot overflow. Negating the check with !(...) makes awk print exactly the lines for which c is 0.
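To make the counter logic easier to follow, here is a minimal sketch of the same one-liner written out in long form and run against a small sample. The file name sample.txt, the helper name skip_after and the variable n are assumptions made for illustration; they are not part of the answer above.

```shell
# Build a small sample file (hypothetical name: sample.txt).
printf '%s\n' a pattern f1 f2 f3 f4 f5 b c > sample.txt

# Long form of: awk '/pattern/{c=5;next} !(c&&c--)' file
# skip_after is a hypothetical helper: $1 is the file, $2 the number of lines to skip.
skip_after() {
  awk -v n="$2" '
    /pattern/ { c = n; next }   # matching line: arm the counter, print nothing
    c > 0     { c--; next }     # counter armed: swallow the line and count down
              { print }         # counter at 0: print the line
  ' "$1"
}

skip_after sample.txt 5
```

With this sample, only the lines a, b and c are printed. Passing the count as a parameter is just a convenience of the sketch; the original one-liner hard-codes 5.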


Side-note: Normally you would use the word regexp if you mean a regular expression, not pattern.

hek2mgl
  • change that to `!(c&&c--)` and I'll delete my answer. with huge files `c--` will be a problem – oguz ismail May 13 '19 at 08:45
  • @oguzismail You should keep yours. Thanks for the hint!! – hek2mgl May 13 '19 at 08:49
  • I would if this wasn't the millionth time this question asked, but nah, thanks – oguz ismail May 13 '19 at 08:52
  • Please change `pattern` to `regexp`. We're not knitting a sweater :-). – Ed Morton May 13 '19 at 14:10
  • I know you are picky about this and I'm normally using `regexp` when I mean a regular expression. In this case it's just that the OP used the word `pattern` in the question and keeping it makes my example executable with the given input. – hek2mgl May 13 '19 at 16:00
  • I rephrased this a bit to make clear that I'm referring to the literal word `pattern` and added a note ;) – hek2mgl May 13 '19 at 16:02
2

With GNU sed (which should be fine, since the OP mentions Linux):

sed '/pattern/,+5d' ip.txt

This deletes each line matching the given regex together with the 5 lines that follow it. The addr1,+N address form is a GNU extension.
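A quick way to try this out, assuming GNU sed is installed. The file names sample_sed.txt and overlap.txt are hypothetical. The second file illustrates one case where the sed range and a resetting awk counter can differ: a second match that falls inside the 5 lines after the first one.

```shell
# Build a small sample file (hypothetical name: sample_sed.txt).
printf '%s\n' a pattern f1 f2 f3 f4 f5 b c > sample_sed.txt

# addr1,+N selects the matching line plus the N lines after it (GNU extension);
# d deletes every line in that range.
sed '/pattern/,+5d' sample_sed.txt

# A second match inside an already active range (hypothetical name: overlap.txt).
printf '%s\n' pattern x pattern y1 y2 y3 y4 y5 z > overlap.txt

# sed does not restart the range at the second match, so y4, y5 and z survive.
sed '/pattern/,+5d' overlap.txt

# An awk counter that is reset on every match skips 5 lines after the
# second match as well, so only z survives.
awk '/pattern/{c=5;next} !(c&&c--)' overlap.txt
```

For input like the question's, where matches are more than 5 lines apart, both commands produce the same output.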

Sundeep
0

I did it using this:

head -$(wc -l myfile.txt | awk '{print $1-5 }') myfile.txt | grep -v "whatever"

which means:

wc -l myfile.txt    : how many lines (but it also shows the filename)
awk '{print $1}'    : only show the amount of lines
awk '{print $1-5 }' : we don't want the last five lines
head ...            : show the first ... lines (which means, leave out the last five)
grep -v "..."       : this part you know :-)
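The steps listed above can be spelled out with intermediate variables, which makes each stage easy to inspect. This is only a sketch: the file name tail_demo.txt and the variable names total and keep are assumptions made for illustration.

```shell
# Build a small sample file (hypothetical name: tail_demo.txt).
printf '%s\n' keep1 whatever keep2 t1 t2 t3 t4 t5 > tail_demo.txt

# Reading from stdin makes wc print only the number, without the file name.
total=$(wc -l < tail_demo.txt)

# Number of lines to keep: everything except the final five lines of the file.
keep=$((total - 5))

# Print the first $keep lines, then drop any line containing "whatever".
head -n "$keep" tail_demo.txt | grep -v "whatever"
```

With this sample, the pipeline prints keep1 and keep2: the five trailing lines are cut by head, and the remaining match is removed by grep -v.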
Dominique