Scripting - copy line and the second line IF the second line has a string

Question

I have a large number of files that I need to scan, returning a line and its following line, but only when the following line begins with a given string.

String one: line one must begin with 'Bill'.
String two: line two must begin with 'Jones'.

If both criteria are met, the two lines should be returned. This should be repeated for the whole file.

For example, given this original file:

Edith Blue
Edith Green
Edith Red
Bill Blue
Jones Red
Edith Green
Bill Green
Edith Red
Jones Green
Bill Blue

I'd want it to return only:

Bill Blue
Jones Red

Any ideas? I have no idea where to begin with this, as I only have basic scripting skills with sed, awk, etc. At the moment I am using this to get each matching line (prefixed with its filename) and the line after it, but it gives me too much useless information that I have to strip off with other sed commands.

grep -A 1 "^Bill" * > test.txt

I guess there's a far more elegant way of getting only the lines I need. Any help would be lovely!

What you really need is pcregrep. Have a look here: http://stackoverflow.com/questions/152708/how-can-i-search-for-a-multiline-pattern-in-a-file-use-pcregrep/152711#152711 – Nehal Dattani Oct 18 '13 at 18:09

beroe · Answer 1 · 2013-10-18T15:58:19.623

As an extension of your initial approach, a simple solution is to grep for lines starting with "Bill" together with the line after each one (-A1), then pipe the result into a second grep for lines starting with "Jones" together with the line before each one (-B1):

grep -A1 "^Bill" myfile.txt | grep -B1 "^Jones"

Output:

Bill Blue
Jones Red

Side note: as a true test, your input file should probably also have some lines where Bill and Jones appear but not at the start of the line, for example:

Edith Blue
Edith Jones
Edith Red
Bill Blue
Jones Red
Edith Bill
Bill Jones
Edith Red
Jones Green
Bill Blue
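
The test input above can also be checked outside the shell. The following is a minimal Python sketch of the same "line plus following line" rule, not part of the grep approach itself; the function name matching_pairs and its parameters are hypothetical, chosen only for illustration.

```python
def matching_pairs(lines, first="Bill", second="Jones"):
    """Return (line, next_line) tuples where line starts with `first`
    and next_line starts with `second`."""
    return [
        (cur, nxt)
        for cur, nxt in zip(lines, lines[1:])  # every pair of adjacent lines
        if cur.startswith(first) and nxt.startswith(second)
    ]


# The extended test input from the side note above
sample = """Edith Blue
Edith Jones
Edith Red
Bill Blue
Jones Red
Edith Bill
Bill Jones
Edith Red
Jones Green
Bill Blue""".splitlines()

for cur, nxt in matching_pairs(sample):
    print(cur)
    print(nxt)
```

Lines such as "Edith Jones" and "Edith Bill" are ignored because startswith only looks at the beginning of each line, which is the same job the ^ anchor does in the grep patterns.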

score 1 · Answer 2 · answered Oct 18 '13 at 15:25

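Use awk's getline to read the following line whenever a line begins with Bill:

awk '
    /^Bill/ {
        # Read ahead; while the next line also begins with Bill, make it the current line
        while ((r = (getline l)) > 0 && l ~ /^Bill/) $0 = l
        # Print only if a next line was actually read (r > 0) and it begins with Jones
        if (r > 0 && l ~ /^Jones/) printf "%s\n%s\n", $0, l
    }
' infile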

It yields:

Bill Blue
Jones Red

answered Oct 18 '13 at 15:25

Birei

score 1 · Answer 3 · answered Oct 18 '13 at 15:29

And here is another way using awk with a flag:

$ awk '$1=="Bill"{p=1;a=$0;next};$1=="Jones"&&p{print a;print};{p=0}' file
Bill Blue
Jones Red

answered Oct 18 '13 at 15:29

user000001

m3h2014 · Answer 4 · 2013-10-18T17:02:39.600

1

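Here is a simple Python script:

FILE = 'test.text'

one = 'Bill'
two = 'Jones'

prev = ''

# The with block closes the file automatically when the loop ends
with open(FILE, 'r') as f:
    for line in f:
        # Print the pair when the previous line starts with one and this line with two
        if prev.startswith(one) and line.startswith(two):
            print(prev.rstrip('\n'))
            print(line.rstrip('\n'))
        prev = line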

Yields:

python FileRead.py
Bill Blue
Jones Red
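
Since the question mentions scanning many files, the same previous-line idea can be wrapped in a function and applied file by file. This is only a sketch under assumptions: the names pairs_in_file and pairs_in_files are hypothetical, and the files are assumed to be plain text readable with the default encoding.

```python
def pairs_in_file(path, first="Bill", second="Jones"):
    """Yield (line, next_line) pairs from one file, reading it line by line."""
    prev = None
    with open(path, "r") as handle:
        for raw in handle:
            line = raw.rstrip("\n")
            # A pair matches when the previous line starts with `first`
            # and the current line starts with `second`
            if prev is not None and prev.startswith(first) and line.startswith(second):
                yield prev, line
            prev = line


def pairs_in_files(paths, first="Bill", second="Jones"):
    """Yield (path, line, next_line) for every matching pair in every file.

    The previous-line state lives inside pairs_in_file, so it resets for each
    file and a pair can never span the end of one file and the start of the next.
    """
    for path in paths:
        for prev, line in pairs_in_file(path, first, second):
            yield str(path), prev, line
```

A caller could loop over pairs_in_files(sys.argv[1:]) and print the two lines of each result, with or without the filename. Because the files are read one line at a time, memory use stays small even for large inputs.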

edited Oct 18 '13 at 17:02

answered Oct 18 '13 at 15:34

m3h2014

(Probably more efficient to use `.startswith` rather than `split` and test?) – beroe Oct 18 '13 at 16:00

score 0 · Answer 5 · answered Oct 18 '13 at 21:28

This might work for you (GNU sed):

sed -n '$!N;/^Bill.*\nJones/p;D' file
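
For readers unfamiliar with N and D: the one-liner keeps a two-line window in the pattern space, prints it if it matches, then deletes the first line of the window and slides forward by one line. A rough Python sketch of that sliding-window idea follows. The function name sliding_window_matches is hypothetical, and this illustrates the technique rather than reproducing sed's semantics in every edge case.

```python
import re


def sliding_window_matches(lines, pattern=r"^Bill.*\nJones"):
    """Collect two-line windows that match `pattern`, sliding one line at a time."""
    regex = re.compile(pattern)
    matches = []
    for first, second in zip(lines, lines[1:]):
        window = first + "\n" + second  # comparable to the pattern space after N
        if regex.match(window):         # comparable to /^Bill.*\nJones/p
            matches.append(window)
        # moving to the next pair is comparable to D dropping the first line
    return matches


sample = [
    "Edith Blue", "Edith Green", "Edith Red", "Bill Blue", "Jones Red",
    "Edith Green", "Bill Green", "Edith Red", "Jones Green", "Bill Blue",
]
print("\n".join(sliding_window_matches(sample)))
```

In Python's re module, . does not match a newline by default, so the pattern requires Bill at the start of the first line and Jones immediately after the line break.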

answered Oct 18 '13 at 21:28

potong
