How to use sed and/or regex to trim a line in a file using bash?

Question

This seems like it should be simple, but I've spent far too much time searching. How can I use sed and regex to trim off all words in a line after the fourth word?

For instance, from:

19900101, This is a title
19091110, This is a really long title

I would like to get:

19900101, This is a
19091110, This is a

I've tried answers like the one in "Regex to extract first 3 words from a string", but I'm using Mac OS X, so I get context address errors.

The resource you link to uses a regex dialect which isn't supported by *any* `sed` version I am familiar with. You could try Perl, or figure out how to portably express things like `\s` in "traditional" regex. (It's not terribly hard. I'll post an answer to the linked question.) — tripleee, Mar 01 '17 at 04:45

score 4 · Accepted Answer · answered Feb 28 '17 at 15:53

This is easily done using cut:

cut -d ' ' -f 1-4 file

19900101, This is a
19091110, This is a

Or using awk:

awk '{NF=4} 1' file

19900101, This is a
19091110, This is a

anubhava

Both of these work perfectly! Would you mind explaining the syntax? – westcoast_509 Feb 28 '17 at 18:48
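The awk command sets `NF` (the number of fields) to 4, which discards everything after the fourth field; the trailing `1` is an always-true pattern, so the rebuilt line is printed. The `cut` command splits each line on a space (`-d ' '`) and keeps fields 1 through 4 (`-f 1-4`). – anubhava Feb 28 '17 at 19:17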
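
To make the explanation above concrete, here is a small sketch that runs both commands on the question's sample lines. The temporary file and the variable names (`sample`, `cut_out`, `awk_out`, `cut_gap`, `awk_gap`) are made up for illustration. It also shows one difference that only matters if the input contains repeated spaces, which the question's data does not.

```shell
# Hypothetical sample file holding the two lines from the question.
sample=$(mktemp "${TMPDIR:-/tmp}/sample.XXXXXX")
printf '%s\n' '19900101, This is a title' \
              '19091110, This is a really long title' > "$sample"

# cut: -d ' ' sets a single space as the delimiter; -f 1-4 keeps fields 1 to 4.
cut_out=$(cut -d ' ' -f 1-4 "$sample")

# awk: NF=4 truncates each record to four fields; the pattern 1 is always
# true, so the default action prints the rebuilt record.
awk_out=$(awk '{NF=4} 1' "$sample")

# The tools differ on repeated spaces: cut counts an empty field between
# two adjacent spaces, while awk treats a run of blanks as one separator.
cut_gap=$(printf 'a  b c d e\n' | cut -d ' ' -f 1-4)
awk_gap=$(printf 'a  b c d e\n' | awk '{NF=4} 1')

printf '%s\n' "$cut_out" "$awk_out" "$cut_gap" "$awk_gap"
rm -f "$sample"
```

On the sample data both commands give the same two lines. With the double-space input, `cut` keeps `a  b c` (the empty field counts as one of the four), while `awk` prints `a b c d`.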

score 0 · Answer 2 · answered Feb 28 '17 at 20:55

This might work for you (GNU sed):

sed 's/\s*\S*//5g' file

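This removes the fifth and all subsequent words from each line: the `5g` flag applies the substitution to the fifth match and every match after it. Note that `\s` and `\S` are GNU extensions.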
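
Since the question mentions Mac OS X, where the stock sed is not GNU sed, here is one possible portable variant. It is not part of the answer above, only a sketch: it replaces `\s` and `\S` with POSIX bracket expressions and assumes a sed that accepts `-E` for extended regular expressions (both GNU sed and the BSD sed shipped with macOS do). The function name `trim_to_four` is hypothetical.

```shell
# Keep the first four whitespace-separated words of each line.
# [[:space:]] and [^[:space:]] are POSIX classes standing in for \s and \S.
# Lines with fewer than four words, or with leading whitespace, do not
# match the pattern and pass through unchanged.
trim_to_four() {
  sed -E 's/^(([^[:space:]]+[[:space:]]+){3}[^[:space:]]+).*/\1/' "$@"
}

# Usage: read from standard input, or pass file names as arguments.
printf '%s\n' '19900101, This is a title' \
              '19091110, This is a really long title' | trim_to_four
```

Instead of deleting the fifth and later matches, this version captures the first four words in a group and replaces the whole line with that group, which avoids relying on the combined `5g` flag.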

potong
