0

The following command prints the file up to and including the first line that matches WORD:

 awk '1;/WORD/{exit}' file

But how do I print the file from the string WORD until the end of the file, not including the string WORD itself?

maihabunash
  • The fantastic answer [here](http://stackoverflow.com/a/17914105/258523) includes an answer to your question. – Etan Reisner Mar 16 '15 at 16:39
  • That would not print until the string WORD, it would print until the end of the line that contains the string WORD. Post some sample input and expected output and place "WORD" in the middle of a line if that's possible in your real input. Also consider how you would want NEWORDER treated if it occurred mid-file - does the WORD in the middle of that match your string or not? – Ed Morton Mar 16 '15 at 17:06

4 Answers

3

As Etan Reisner says in a comment, there is a nice cookbook of range patterns in this answer. But the simplest way to match from a pattern to the end of a file is:

awk '/WORD/,0' file  

In order to print from the line following the line containing a pattern, we could instead do this:

awk 'found,0;/WORD/{found=1}' file

To also print the part of the matching line that follows WORD, it is only necessary to modify the last action, but it's convenient to replace the regular expression with an explicit call to match in order to set RSTART and RLENGTH:

awk 'found,0;match($0,/WORD/){found=1;print substr($0, RSTART+RLENGTH)}' file

Range patterns have the form expression,expression, and the meaning is to match from the first line which matches the first expression to the first line which matches the last expression, inclusively. The range is repeated until the file is fully processed.

In these examples, the second expression always evaluates to 0 (false), so the range never terminates and all lines are matched once the pattern succeeds.
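As a quick check of the range-pattern behaviour described above, here is a minimal sketch. The three-line sample text and the variable names (`sample`, `from_match`, `after_match`) are made up for illustration and are not part of the question.

```shell
# Hypothetical input: WORD appears in the middle of the second line.
sample=$(printf 'alpha\nbeta WORD gamma\ndelta\n')

# Range from the first line matching WORD to the end of input
# (the end expression 0 is never true, so the range never closes).
from_match=$(printf '%s\n' "$sample" | awk '/WORD/,0')

# Same idea, but the range only opens on the line after the match,
# because found is still 0 when the matching line itself is tested.
after_match=$(printf '%s\n' "$sample" | awk 'found,0;/WORD/{found=1}')

printf '%s\n---\n%s\n' "$from_match" "$after_match"
```

With this input the first command should keep the last two lines and the second only the last one.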

Similarly, another way to solve the "print all lines until a pattern" problem would be the following, although it is less efficient because it reads the entire file instead of exiting at the match:

awk 'NR==1,/WORD/' file

Also, if the goal is to print only up to the instance of the pattern (as opposed to the complete line containing the pattern), we could make a simple modification to the original program:

awk 'match($0, /WORD/){print substr($0, 1, RSTART+RLENGTH-1); exit}1' file
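A small sketch of the truncation idea, using `substr` to keep everything up to the end of the first match. The sample text and the `upto` variable are hypothetical; whether WORD itself should be kept is an assumption, since the question does not say.

```shell
# Hypothetical input: WORD appears in the middle of the second line.
sample=$(printf 'alpha\nbeta WORD gamma\ndelta\n')

# Lines before the match are printed by the trailing 1.
# On the matching line, print up to the last character of WORD, then stop.
upto=$(printf '%s\n' "$sample" |
  awk 'match($0, /WORD/){print substr($0, 1, RSTART+RLENGTH-1); exit}1')

printf '%s\n' "$upto"
```

To stop just before WORD rather than after it, the length argument would be `RSTART-1` instead.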
rici
1

This MIGHT be what you want:

$ cat file                                 
As market-days are wearing late,
And folk begin WORD to tak the gate;
While we sit bousin, at the nappy,
And gettin fou and unco happy,

$ awk '!f && sub(/.*WORD/,""){f=1} f' file
 to tak the gate;
While we sit bousin, at the nappy,
And gettin fou and unco happy,
Ed Morton
  • Yeah, but if WORD appears twice in the line? :) – rici Mar 16 '15 at 17:10
  • Then this may or may not do what he wants just like if WORD appeared a second time anywhere else in the file or if it appeared mid-word or .... The OP hasn't put a lot of thought into his requirements or if he has he hasn't told us about them yet! It was a good excuse to quote some Burns either way though :-). – Ed Morton Mar 16 '15 at 17:11
  • Absolutely true, the problem is underspecified. But "print from the string WORD" needs to be interpreted very loosely to generate "print from the last occurrence of WORD in the first line containing WORD". – rici Mar 16 '15 at 17:14
  • True but I suspect his real input has WORD on its own on a line and I don't want to put much thought/effort into this until we see some decent requirements and sample input/output. Hopefully this will get the OP thinking... – Ed Morton Mar 16 '15 at 17:16
  • I was looking for a short solution, but it gives `WORD` at the end. Do you have any idea how to fix it: `awk -v RS="WORD" 'NR>1' ORS="WORD"` It works if there is only one occurrence of the pattern in the file and we remove the `ORS` – Jotne Mar 16 '15 at 17:52
  • Change it to `awk -v RS="WORD" -v ORS= 'NR>1' file`. You need to set ORS to null to stop it from appending a trailing newline. – Ed Morton Mar 16 '15 at 19:36
0

If the text contains the pattern only once, this GNU awk command (GNU because of the multi-character RS) will work:

awk -v RS="WORD" 'NR>1' file

It works like Ed's solution: it starts with the first data after WORD, prints the rest of that line, and then prints every following line up to EOF.


This will print from the line after the one where WORD is found until EOF.
If you need the data on the same line after WORD, look at Ed's answer.

awk 'f;/WORD/{f=1}' file

Example, using the pattern four:

cat file
1 one
2 two
3 three
4 four
5 five
6 six
7 seven
8 eight
9 nine
10 ten

awk 'f;/four/ {f=1}' file
5 five
6 six
7 seven
8 eight
9 nine
10 ten
Jotne
0

This might work for you (GNU sed):

sed '1,/WORD/{/WORD/!d;s//\n/;D}' file

This deletes every line up to the first one containing WORD, then replaces WORD with a newline and deletes up to and including that newline. The rest of the file is printed as normal.
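To see the steps above on concrete data, here is a minimal sketch. It assumes GNU sed (other implementations may not accept `\n` in the replacement), and the sample text and the `rest` variable are invented for illustration.

```shell
# Hypothetical input: WORD appears in the middle of the second line.
sample=$(printf 'alpha\nbeta WORD gamma\ndelta\n')

# Inside the range 1,/WORD/: drop lines without WORD; on the matching line,
# turn WORD into a newline and let D discard everything up to that newline.
rest=$(printf '%s\n' "$sample" | sed '1,/WORD/{/WORD/!d;s//\n/;D}')

printf '%s\n' "$rest"
```

Note that the space following WORD is kept, so the first output line here starts with a blank.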

potong