
I am trying to find the last occurrence of a pattern in a file and delete everything after the line containing that last match. Is this possible using awk or sed? Thanks in advance.

aaaaaa bbbbb cccccc
aaaaaa pattern dddddd
eeeeee fffff gggg
qqqq eeee rrrr 

desired output:

aaaaaa bbbbb cccccc
aaaaaa pattern dddddd
fedorqui

4 Answers


tac to the rescue:

$ tac b | awk '/pattern/ {p=1}p' | tac
aaaaaa bbbbb cccccc
aaaaaa pattern dddddd

Another example:

$ cat a
aaaaaa bbbbb cccccc
aaaaaa pattern dddddd
eeeeee fffff gggg
aaaaaa pattern dddddd
qqqq eeee rrrr
$ tac a | awk '/pattern/ {p=1}p' | tac
aaaaaa bbbbb cccccc
aaaaaa pattern dddddd
eeeeee fffff gggg
aaaaaa pattern dddddd
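
The pipeline works because reversing the file turns "the last match" into "the first match", which awk can detect in a single pass. A minimal sketch of the same idea as a reusable function follows; the name `keep_through_last` and its argument order are made up for illustration, and it assumes `tac` (GNU coreutils) is available:

```shell
# Hypothetical helper: print a file up to and including the last line
# that matches an awk regular expression.
keep_through_last() {
    # $1: awk regular expression, $2: file name
    tac "$2" |                                  # reverse: the last match now comes first
        awk -v re="$1" '$0 ~ re { p = 1 } p' |  # print from the first match onwards
        tac                                     # restore the original line order
}
```

Usage would be `keep_through_last pattern a`. Note that if nothing matches, `p` is never set and the output is empty.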
Community
fedorqui
  • Also: `tac a | sed -n '/pattern/,$p' | tac` – William Pursell Sep 27 '13 at 11:37
  • What if I want to search for the first pattern and delete everything above it, including the pattern line? –  Sep 27 '13 at 12:23
  • @user2809888 this is what you asked before: http://stackoverflow.com/questions/19047312/delete-everything-before-pattern-including-pattern-using-awk-or-sed – fedorqui Sep 27 '13 at 12:25
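awk '
    # Append the file name to ARGV again so the file is read twice.
    # Note: whether ARGC++ takes effect before ARGV[ARGC-1] is evaluated
    # is implementation-dependent (see the comments below).
    BEGIN { ARGV[ARGC++] = ARGV[ARGC-1] }
    # First pass: remember the line number of the last match.
    NR==FNR { if (/pattern/) lastLine = NR; next }
    # Second pass: print each line and stop after the last matching line.
    { print }
    FNR == lastLine { exit }
' file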

To demonstrate how the postfix `++` behaves above (see comments below):

$ awk 'BEGIN{ i=3; a[i++] = i; for (j in a) print j, a[j]; print i }'
3 3
4
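
Since the evaluation order of `ARGV[ARGC++] = ARGV[ARGC-1]` can differ between awk implementations, one way to sidestep the question is to pass the file name twice on the command line. This is a minimal sketch under that assumption; the function name `two_pass_keep` is made up for illustration, and the input must be a regular file because it is read twice:

```shell
# Hypothetical helper: same two-pass logic, with the file named twice
# instead of modifying ARGV in a BEGIN block.
two_pass_keep() {
    # $1: file name (regular file; it is opened twice)
    awk '
        # Pass 1 (NR == FNR only while reading the first copy):
        # remember the line number of the last match.
        NR == FNR { if (/pattern/) lastLine = NR; next }
        # Pass 2: print lines and stop after the last matching line.
        { print }
        FNR == lastLine { exit }
    ' "$1" "$1"
}
```

Called as `two_pass_keep file`, it should print the same lines as the script above. As with that script, a file without any match is printed in full, because `lastLine` is never set.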
Ed Morton
  • I always learn something from your answers, and I think this time too. Can you explain the BEGIN block? I understand that we want to read the input file twice by appending to ARGV, but I cannot understand this line: `ARGV[ARGC++] = ARGV[ARGC-1]`. Say we have `awk '..' file`; then `ARGC=2; V[0]=awk; V[1]=file`. Now we do `V[ARGC++]...`, so we assign `V[2]=something` and `ARGC` is now 3. The right side, `ARGV[ARGC-1]`, is then actually `V[3-1]`, but `V[2]` is empty. Why didn't `ARGC++` change `ARGC` here? – Kent Sep 27 '13 at 15:12
  • @Kent the only "tricky" part is that `++` is a postfix operator and so doesn't occur until AFTER the whole statement is evaluated so it's equivalent to `ARGV[ARGC] = ARGV[ARGC-1]; ARGC++`, i.e. ARGC retains its original value until after the assignment. – Ed Morton Sep 27 '13 at 15:17
  • @Kent I updated my answer just to show how postfix works in a small script. – Ed Morton Sep 27 '13 at 15:25
  • Thank you so much. I tested it a bit just now: `awk 'BEGIN{x=5;a[x++]=x;print a[5]}'` printed 5, not 6. Interesting, and not like other programming languages. Good to know. Thank you for the answer and the explanation. +1, ++1 ;) – Kent Sep 27 '13 at 15:30
  • I suspect the difference might be related to interpreted vs compiled languages. If I could get my `awkcc` to work, it'd be an interesting test! – Ed Morton Sep 27 '13 at 16:16
  • I don't know so many languages. Python has no `x++`, I know in `java`, it works not like awk does. I don't have much experience with `C` family, so I cannot tell, but I guess it would behave same as java. – Kent Sep 27 '13 at 16:20
  • @Kent I just discovered that it is implementation-dependent according to the bottom section of gnu.org/software/gawk/manual/html_node/Increment-Ops.html. Sorry if I led you astray! – Ed Morton Oct 01 '13 at 03:13

This one-liner should work for your requirement:

awk '/pattern/{_=NR}{a[NR]=$0}END{for(i=1;i<=_;i++)print a[i]}' file

I did a small test:

kent$  cat f
aaaaaa bbbbb cccccc
aaaaaa pattern dddddd
eeeeee fffff gggg
qqqq eeee rrrr 
aaaaaa pattern dddddd
111
222

kent$  awk '/pattern/{_=NR}{a[NR]=$0}END{for(i=1;i<=_;i++)print a[i]}' f
aaaaaa bbbbb cccccc
aaaaaa pattern dddddd
eeeeee fffff gggg
qqqq eeee rrrr 
aaaaaa pattern dddddd
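
The one-liner buffers every line in an array, records the line number of the most recent match, and prints lines 1 through that number in the END block. Here is a sketch of the same logic with descriptive variable names; the function name `buffered_keep` and the names `last` and `line` are substitutions made for readability, not part of the original command:

```shell
# Hypothetical helper: single pass, whole input buffered in memory.
buffered_keep() {
    # Reads the files given as arguments, or standard input if there are none.
    awk '
        /pattern/ { last = NR }      # line number of the most recent match
        { line[NR] = $0 }            # remember every line
        END {
            for (i = 1; i <= last; i++)
                print line[i]        # print up to and including the last match
        }
    ' "$@"
}
```

Because it reads the input only once, this form also works on a pipe, e.g. `some_command | buffered_keep`, at the cost of holding the whole input in memory. If nothing matches, `last` stays unset and nothing is printed.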
Kent
  • @fedorqui I found that too.. now it should give the right thing. – Kent Sep 27 '13 at 11:36
  • What does the `i<=_` part mean in `(i=1;i<=_;i++)`? – fedorqui Sep 27 '13 at 11:38
  • @EdMorton because it should be the *lastline* of matching. `_` looks like the last line. :D – Kent Sep 27 '13 at 12:10
  • I don't get it. How does `_` look like the last line? Not trying to be difficult, just honestly trying to understand as I seem to be missing the point. – Ed Morton Sep 27 '13 at 12:12
  • What if I want to search for the first pattern and delete everything above it, including the pattern line? –  Sep 27 '13 at 12:20
  • @user2809888 http://stackoverflow.com/questions/19047312/delete-everything-before-pattern-including-pattern-using-awk-or-sed – Kent Sep 27 '13 at 12:42
  • I did give a thumbs down for this since the non-intuitive variable `_` was used for no reason and with no explanation. Using a common-sense variable name makes a script easy to read and understand for all. – Jotne Sep 27 '13 at 13:19
  • @Jotne You can have your own opinion for sure. I agree with you, `_` doesn't look like other common var names, easy to be read, but personally I don't think the answer deserves a downvote, even if it could be written better. 1) it works 2) corner cases were considered as well 3) `_` is a valid variable name in awk. that is, we can use it. 4) the `_` is underscore, for me it looks like the `Lastline in a file`. so I used it. Would you vote `awk` down too (if you could) since awk allows a var named `_` ? Or you can suggest, when should we use the var name `_`? – Kent Sep 27 '13 at 13:44
  • I wouldn't downvote it, but the intent of allowing `_` as a character in a name is to separate words in a general naming scheme (e.g. `foo_bar` vs `fooBar`) or to prefix names that are in some way "special" (e.g. local variables declared as function args, `foo(realArg, _localVar)`). In C, macro-local vars should start with `_` and compiler symbols with double `__` but there's no equivalent awk conventions. All entities in software should have a name that identifies what they are, as such you would never name any symbol `_` as that tells you nothing at all about the entity with that name. – Ed Morton Sep 27 '13 at 15:00
  • I cannot remove the downvote. I agree it's a valid character, but a simple `b` would do. – Jotne Sep 27 '13 at 21:34

This might work for you (GNU sed):

sed -r '/pattern/{x;/./p;d};x;/./!{x;b};x;H;$!d;x;P;d' file
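
The one-liner is dense, so here is a sketch of the same command sequence written one command per line with comments describing what each step appears to do. The wrapper name `sed_keep_through_last` is made up for illustration, and `-r` is left out because nothing in the script uses extended-regex syntax:

```shell
# Hypothetical wrapper: the same sed commands, expanded and commented.
sed_keep_through_last() {
    # $1: file name; the regex "pattern" is hard-coded as in the one-liner
    sed '
        # Matching line: swap it into the hold space. If a block was held
        # since the previous match, print it. Then start the next cycle.
        /pattern/{
            x
            /./p
            d
        }
        # Non-matching line: peek at the hold space.
        x
        # Hold space empty means no match seen yet: swap back and print the line.
        /./!{
            x
            b
        }
        # Otherwise swap back and append the line to the held block.
        x
        H
        # Not the last line: keep collecting without printing.
        $!d
        # Last line: print only the first held line, which is the last match.
        x
        P
        d
    ' "$1"
}
```

One case worth testing before relying on it: when the final line of the file is itself a match, the script appears to move that line into the hold space and end without printing it.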
potong