Counting lines after grep finds a string

Question

I'd like to count the instances of 'text' after each 'header'. I'm using grep and awk but open to any tools. My file looks like this:

header1
text1
text2
text3
header2
text1
header3
header4
text1
text2
...

A great output would look like this:

header1 3
header2 2
header3 0
header4 2
...

My question is similar to this, but instead of the total number of occurrences I need the number of occurrences after each header, up to the next one.

grep cannot do this, but awk is perfect. What have you tried? Please post some code. — Karoly Horvath, Aug 14 '13 at 10:11
Why `header2` is showing 2 when there is only `text1` below it? — anubhava, Aug 14 '13 at 10:41

user000001 · Answer 1 · 2013-08-14T10:29:35.637

4

This awk command processes the input line by line, so it does not store the entire file in memory. It counts every non-header line that follows each header:

awk '/^header/{if (head) print head,k;head=$1; k=0}!/^header/{k++}END{print head,k}' file

If you only want to count the lines that contain the string `text`, change the script to this:

awk '/^header/{if (head) print head,k;head=$1; k=0}/text/{k++}END{print head,k}' file
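
For readability, the first one-liner above can be spread over several lines. This is only a sketch: the function name count_after_headers and the seen flag are introduced here for illustration and are not part of the original answer. Unlike the one-liner, it prints nothing at all for empty input.

```shell
# Hypothetical helper; reads the data on stdin.
count_after_headers() {
  awk '
    /^header/ {                  # a header line starts a new group
      if (seen) print head, k    # flush the previous group, if there was one
      head = $1
      k = 0
      seen = 1
      next
    }
    seen { k++ }                 # count every non-header line after a header
    END {
      if (seen) print head, k    # flush the last group; print nothing for empty input
    }
  '
}

# Sample data from the question.
printf '%s\n' header1 text1 text2 text3 header2 text1 header3 header4 text1 text2 |
  count_after_headers
```

Only the current header and its counter are kept at any time, which is why memory use does not grow with the size of the file.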

edited Aug 14 '13 at 10:29

answered Aug 14 '13 at 10:17

user000001

fedorqui · Accepted Answer · 2013-08-14T10:24:51.723

2

With awk:

$ awk '{if (/header/) {h=$0; a[h]=0} if (/text/) {a[h]++}} END{for (i in a) {print i" "a[i]}}' file
header1 3
header2 1
header3 0
header4 2

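{if (/header/) {h=$0; a[h]=0} if (/text/) {a[h]++}} remembers the most recent "header" line in h and counts in a[h] the "text" lines that follow it.
END{for (i in a) {print i" "a[i]}} prints the result after reading the file. Note that awk does not define the iteration order of for (i in a), so the headers are not guaranteed to be printed in file order.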
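
Because awk leaves the order of for (i in a) unspecified, one way to keep the headers in file order is to record them in a second, numerically indexed array. The following is a minimal sketch under that assumption; the names count_text_per_header, order, count and n are made up for this example and do not come from the answer. It also anchors the header pattern with ^, and a header that appears twice keeps a single entry whose count accumulates.

```shell
# Hypothetical helper; reads the data on stdin.
count_text_per_header() {
  awk '
    /^header/ {
      h = $0
      if (!(h in count)) {       # first time this header is seen
        order[++n] = h           # remember its position in the file
        count[h] = 0
      }
      next
    }
    /text/ && n { count[h]++ }   # count "text" lines once a header has been seen
    END {
      for (i = 1; i <= n; i++)   # walk the headers in file order
        print order[i], count[order[i]]
    }
  '
}

# Sample data from the question.
printf '%s\n' header1 text1 text2 text3 header2 text1 header3 header4 text1 text2 |
  count_text_per_header
```

This version still holds one array entry per header in memory, so it trades a little memory for a predictable output order.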

edited Aug 14 '13 at 10:24

answered Aug 14 '13 at 10:17

fedorqui

So what if the text contains `header`? – Karoly Horvath Aug 14 '13 at 10:20
It is a possibility. I just updated the answer to take it into consideration. – fedorqui Aug 14 '13 at 10:25
It's safe to assume those texts don't literally contain/start with `text`. I'm pretty sure that was just an example. Check the other posted solution. – Karoly Horvath Aug 14 '13 at 10:27
Check the question, please: `I'd like to count the instances of 'text' after each 'header'`. – fedorqui Aug 14 '13 at 10:27
Hmm.. let's see. @psny? – Karoly Horvath Aug 14 '13 at 10:28
@KarolyHorvath, the OP cannot get the notifications if he hasn't posted a comment on this answer. Why "Let's see"? It is clear in his question. My answer accomplishes it and I don't see any need to downvote. – fedorqui Aug 14 '13 at 10:30
I know he sees it. I was just indicating that at this point I'm waiting for him to resolve the dispute ;) – Karoly Horvath Aug 14 '13 at 10:32
It works. Thanks! Note: the text doesn't contain header (and it is well defined). – philshem Aug 14 '13 at 11:09
