How to cut lines based on a pattern match?

Question

I have a file with the contents below:

--B32934--
descr:     X
descr:     Y
descr:     Z
--B20484
descr:     A
descr:     B
descr:     C
--B41946
descr:     1
descr:     2
descr:     3
descr:     4

I just need the --BXXXX number and the first line after it; the rest of the lines need to be stripped out of the file. Is it possible to delete every line after the first one that follows a --B pattern match?

For example, under "--B32934--" I just want "descr:     X". The rest of the lines need to be deleted.

Desired result should look like this

--B32934--
descr:     X

Hello thanks for the reply. The desired result should look like this --B32934-- descr: X --B20484 descr: A --B41946 descr: 1 — venkyjack, Jun 14 '17 at 11:02
Edit additional information into the question. Too hard to read in the comments. And note that there are fancy buttons for formatting at the top. — fancyPants, Jun 14 '17 at 11:07
Possible duplicate of [Printing with sed or awk a line following a matching pattern](https://stackoverflow.com/questions/17908555/printing-with-sed-or-awk-a-line-following-a-matching-pattern) — Sundeep, Jun 14 '17 at 11:35

RomanPerekhrest · Answer 1 · 2017-06-14T11:34:31.513

2

Two approaches:

-- awk approach:

awk '/^--B[0-9]+/{ r=NR; print }(NR-1)==r' file

/^--B[0-9]+/ - matches a line that starts with --B followed by one or more digits
r=NR - remember the record (line) number of that matching line
(NR-1)==r - if it's the next line after the pattern line
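
The follow-up comments ask how to keep more than one line after each header. A minimal sketch of one way to parameterize the record-number idea above: `n` is a hypothetical variable name introduced here, and the added `r &&` guard is my own addition so that nothing matches before the first header has been seen. The sample data is copied from the question.

```shell
# Sample input copied from the question.
sample=$(printf '%s\n' \
  '--B32934--' 'descr:     X' 'descr:     Y' 'descr:     Z' \
  '--B20484' 'descr:     A' 'descr:     B' 'descr:     C' \
  '--B41946' 'descr:     1' 'descr:     2' 'descr:     3' 'descr:     4')

# n (hypothetical name) = number of lines to keep after each header.
# r holds the header's record number, so NR-r is the distance from the header.
first_one=$(printf '%s\n' "$sample" |
  awk -v n=1 '/^--B[0-9]+/{ r=NR; print; next } r && NR-r<=n')
first_two=$(printf '%s\n' "$sample" |
  awk -v n=2 '/^--B[0-9]+/{ r=NR; print; next } r && NR-r<=n')

printf '%s\n' "$first_one"
```

Changing `-v n=1` to `-v n=2` keeps two lines per header without touching the program text.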

To capture the 2 lines following the pattern line, use this approach:

awk '/^--B[0-9]+/{ r=NR; print }(NR-1)==r || (NR-2)==r' file

-- GNU sed approach:

sed -n '/--B[0-9]*/{N;p;}' file

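N - add a newline to the pattern space, then append the next line of input to the pattern space. N itself is a standard sed command, not a GNU extension; BSD/macOS sed additionally needs the ; before the closing }, as written above.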
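
One edge case with N: POSIX specifies that when there is no next line, N quits without printing the pattern space, so with -n a header sitting on the very last line would be dropped. A minimal sketch that guards N with the `$!` address (every line except the last); the sample input is made up for illustration and ends with a header on purpose.

```shell
# Made-up sample: the last header has no line after it.
sample=$(printf '%s\n' '--B32934--' 'descr:     X' 'descr:     Y' '--B20484')

# $!N runs N on every line except the last, so a trailing header still reaches p.
result=$(printf '%s\n' "$sample" | sed -n '/^--B[0-9]/{$!N;p;}')

printf '%s\n' "$result"
```

For the input in the question this guard makes no difference, since every header there is followed by at least one line.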

The output (for both approaches):

--B32934--
descr:     X
--B20484
descr:     A
--B41946
descr:     1

edited Jun 14 '17 at 11:34

answered Jun 14 '17 at 11:08

RomanPerekhrest

My apologies, I am a bit new to sed. When I try this on my Mac, I get the error below: `sed -n '/--AS[0-9]*/{h;N;p}' test` gives `sed: 1: "/--AS[0-9]*/{h;N;p}": extra characters at the end of p command` – venkyjack Jun 14 '17 at 11:14
@venkyjack, that was written for GNU sed. In that case, use my awk approach – RomanPerekhrest Jun 14 '17 at 11:15
The awk commands work on my Mac, but just for my info, could you please let me know what the $0 and B[0-9] do? Thanks. I am asking because in case I need 2 lines, I need to know what to modify. – venkyjack Jun 14 '17 at 11:17
@RomanPerekhrest just add `;` to the end to work with both sed versions... if you are using `[0-9]*` you might as well remove it... and `h` is not needed either... `sed -n '/--B/{N;p;}'`... you can also use `grep -A1 --no-group-separator '\--B'`, but that requires `GNU grep` – Sundeep Jun 14 '17 at 11:19
@venkyjack, see my details – RomanPerekhrest Jun 14 '17 at 11:21
@Sundeep Thanks for the sed command, that works on my mac, if i need two lines displayed how should i modify the command? – venkyjack Jun 14 '17 at 11:21
In `awk`: `$0~/something/` can be just `/something/` – hek2mgl Jun 14 '17 at 11:22
@venkyjack add as many `N;` as you need... `GNU grep` is easiest choice as you can directly give number to `-A` option – Sundeep Jun 14 '17 at 11:23
(You may want to change the explanation below as well) – hek2mgl Jun 14 '17 at 11:23
One more thing: `(NR-1)==r{ print }` can be just `NR-1==r` – hek2mgl Jun 14 '17 at 11:24
@hek2mgl, yeah, going to maximal shortening – RomanPerekhrest Jun 14 '17 at 11:26
@sundeep Thank you. In the SED an extra letter of N prints the two lines for me – venkyjack Jun 14 '17 at 11:26
@RomanPerekhrest Thank you for the explanation, it's been very helpful. If I need to print 2 lines instead of 1, what variable do I change? I tried with an additional N but that stripped out all the lines :( – venkyjack Jun 14 '17 at 11:27
@venkyjack, this one: `sed -n '/--B[0-9]*/{N;N;p;}' file` – RomanPerekhrest Jun 14 '17 at 11:29
@RomanPerekhrest Thank you, with the sed i figured out its an extra N, just wondering what to modify with the awk you sent me earlier – venkyjack Jun 14 '17 at 11:32
@RomanPerekhrest Brilliant that worked. You guys are champions – venkyjack Jun 14 '17 at 11:40
@RomanPerekhrest in your awk command is there any way i can add a new blank line after each output? – venkyjack Jun 15 '17 at 09:46
@venkyjack, in what case? – RomanPerekhrest Jun 15 '17 at 10:46
After the descr: for each B[0-9] just wanted to check if a new blank line can be added in the command awk '/^--B[0-9]+/{ r=NR; print }(NR-1)==r || (NR-2)==r' – venkyjack Jun 15 '17 at 12:10
@venkyjack, unclear, exactly in what place newline should be added? – RomanPerekhrest Jun 15 '17 at 12:23
@RomanPerekhrest After each descr the blank line needs to be placed – venkyjack Jun 15 '17 at 12:53
@venkyjack, use this one: `awk '/^--B[0-9]+/{ r=NR; print }(NR-1)==r{ print $0 ORS }' file` – RomanPerekhrest Jun 15 '17 at 13:01

score 2 · Answer 2 · answered Jun 14 '17 at 11:18

You may use the following awk command:

awk 'p{print;p=0}/^-/{print;p=1}' file

Explanation:

If the variable p is true, print the current line, set p back to 0 (false). If the line starts with a -, print the current line and set p to 1 (true) (in order to print the next line).
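
The comments under this answer ask for a blank line between each group rather than after every line. A minimal sketch of one way to do that with a counter instead of a flag: the names `keep`, `seen` and `n` are hypothetical, and the sample is a shortened copy of the question's data.

```shell
# Shortened sample based on the question's data.
sample=$(printf '%s\n' \
  '--B32934--' 'descr:     X' 'descr:     Y' \
  '--B20484' 'descr:     A' 'descr:     B')

# seen: true once the first header has been printed, so the blank separator
#       goes before every header except the first.
# keep: how many lines are still to be printed after the current header.
grouped=$(printf '%s\n' "$sample" |
  awk -v n=1 '/^--B/{ if (seen++) print ""; print; keep=n; next } keep-- > 0')

printf '%s\n' "$grouped"
```

Printing the separator before each later header, instead of after each group, avoids a trailing blank line at the end of the output.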

hek2mgl

Thank you. This works too, but if i need 2 lines instead of 1 what variable do i modify? – venkyjack Jun 14 '17 at 11:30
@venkyjack see https://stackoverflow.com/questions/17908555/printing-with-sed-or-awk-a-line-following-a-matching-pattern/17914105#17914105 for all sorts of things you can do... `awk '/^-/{p=3} p && p--'` would be one way for matching line and 2 afterwards – Sundeep Jun 14 '17 at 11:34
@Sundeep thank you, is there any way i can add a blank new line after each output? – venkyjack Jun 15 '17 at 09:41
Set the output record separator to two newlines: `awk 'p{print;p=0}/^-/{print;p=1}' ORS='\n\n' file` – hek2mgl Jun 15 '17 at 10:00
@hek2mgl thank you but this adds extra blank line between the B and the descr: , i just need between each set of B and descr: for eg: --B32934-- descr: X descr: Y descr: Z --B20484 descr: A descr: B descr: C – venkyjack Jun 15 '17 at 12:17

score 0 · Answer 3 · answered Jun 14 '17 at 16:05

This might work for you (GNU sed):

sed '/^--B\S\+/!d;n' file

Delete any line that does not begin with --B followed by one or more non-space characters. Otherwise print that line and the following.
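
Both `\S` and `\+` are GNU extensions to basic regular expressions. A hedged sketch of the same idea using only a POSIX bracket expression, in case GNU sed is not available; I have only reasoned this through against GNU sed behaviour, so treat portability to other sed implementations as an assumption. The sample is a shortened copy of the question's data.

```shell
# Shortened sample based on the question's data.
sample=$(printf '%s\n' \
  '--B32934--' 'descr:     X' 'descr:     Y' \
  '--B20484' 'descr:     A' 'descr:     B')

# [^[:space:]] requires at least one non-space character after --B,
# which is what \S\+ asks for in the original command.
# !d deletes non-header lines; n prints the header and loads the next line,
# which is then printed automatically at the end of the cycle.
portable=$(printf '%s\n' "$sample" | sed '/^--B[^[:space:]]/!d;n')

printf '%s\n' "$portable"
```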

potong

score 0 · Answer 4 · answered Jun 14 '17 at 17:15

Here's another solution with grep (GNU grep is needed, since --no-group-separator is a GNU option):

grep -e '^--B[0-9]\+' --no-group-separator -A 1 file

outputs:

--B32934--
descr:     X
--B20484
descr:     A
--B41946
descr:     1

Josef

thank you, is there any way i can add a new blank line after each output? – venkyjack Jun 15 '17 at 10:16
@venkyjack Yep, simply use `grep -e '^--B[0-9]\+' -A 1 --group-separator "" file` – Josef Jun 15 '17 at 18:04
