2

This seems like it should be simple, but I've spent far too much time searching. How can I use sed and regex to trim off all words in a line after the fourth word?

For instance from:

19900101, This is a title
19091110, This is a really long title

I would like to have

19900101, This is a
19091110, This is a

I've tried answers like the one in "Regex to extract first 3 words from a string", but I'm on Mac OS X, so I get context address errors.

westcoast_509
  • The resource you link to uses a regex dialect which isn't supported by *any* `sed` version I am familiar with. You could try Perl, or figure out how to portably express things like `\s` in "traditional" regex. (It's not terribly hard. I'll post an answer to the linked question.) – tripleee Mar 01 '17 at 04:45

2 Answers

4

This is easily done using cut:

cut -d ' ' -f 1-4 file

19900101, This is a
19091110, This is a

Or using awk:

awk '{NF=4} 1' file

19900101, This is a
19091110, This is a
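
The two commands above behave differently when words are separated by more than one space. The sketch below illustrates this with a hypothetical sample line (not taken from the question) that has a double space after the first word: `cut` counts every single space as a delimiter, while `awk` splits on runs of whitespace and rejoins the kept fields with single spaces.

```shell
# Hypothetical sample line with two spaces between the first two words.
line='19900101,  This is a really long title'

# cut treats each space as a delimiter, so the double space produces an
# empty second field and only three words survive.
cut_out=$(printf '%s\n' "$line" | cut -d ' ' -f 1-4)

# awk splits on runs of whitespace; assigning NF rebuilds the line from
# the first four fields, joined by single spaces.
awk_out=$(printf '%s\n' "$line" | awk '{NF=4} 1')

printf '%s\n' "$cut_out" "$awk_out"
```

If the input is known to use exactly one space between words, as in the question's sample, both commands should give the same result.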
anubhava
0

This might work for you (GNU sed):

sed 's/\s*\S*//5g' file

This removes the fifth and every subsequent word from each line.
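
Since the question mentions Mac OS X, where `\s` and `\S` are not available in the stock `sed`, here is a minimal sketch of a variant that sticks to POSIX bracket expressions and the `-E` option, which both BSD and GNU `sed` accept. The function name `first_four_words` is made up for illustration. It assumes lines start with a non-blank character; lines with fewer than four words pass through unchanged.

```shell
# Keep the first four whitespace-separated words of each line.
# [^[:space:]]+ stands in for \S+ and [[:space:]]+ for \s+.
first_four_words() {
  sed -E 's/^(([^[:space:]]+[[:space:]]+){3}[^[:space:]]+).*/\1/' "$@"
}

# Usage with the sample lines from the question:
printf '%s\n' \
  '19900101, This is a title' \
  '19091110, This is a really long title' | first_four_words
```

The capture group holds three "word plus separator" repetitions followed by a fourth word, and `.*` swallows the rest of the line, so replacing the whole match with `\1` leaves only the first four words.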

potong