
I read another answer that shows how to set the field separator using the -F flag:

awk -F 'INFORMATION DATA ' '{print $2}' t

Now I'm curious how I can use a regex for the field separator. My attempt can be seen below:

$ echo "1 2 foo\n2 3 bar\n42 2 baz"
1 2 foo
2 3 bar
42 2 baz
$ echo "1 2 foo\n2 3 bar\n42 2 baz" | awk -F '\d+ \d+ ' '{ print $2 }'
# 3 blank lines

I was expecting to get the following output:

foo
bar
baz 

This is because my regex \d+ \d+ matches "the first two numbers separated by a space, followed by a space", and I'm printing the second field. As shown on rubular:

enter image description here

  • How do I use a regex as the awk field separator?
mbigras
  • In awk I'm printing $2, the second field – mbigras Mar 28 '17 at 23:23
  • awk does not support the Perl-style `\d` metacharacter. Use the POSIX character class `[[:digit:]]` instead of `\d`. https://www.gnu.org/software/gawk/manual/html_node/GNU-Regexp-Operators.html – dawg Mar 28 '17 at 23:37

2 Answers


First, echo in bash does not interpret backslash escapes by default, so it outputs a literal \n; add -e to enable them. Second, awk does not support \d, so you have to use [0-9] or [[:digit:]] instead.

echo -e "1 2 foo\n2 3 bar\n42 2 baz" | awk -F '[0-9]+ [0-9]+ ' '{ print $2 }'

or

echo -e "1 2 foo\n2 3 bar\n42 2 baz" | awk -F '[[:digit:]]+ [[:digit:]]+ ' '{ print $2 }'

Both output:

foo
bar
baz 
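
Since echo's handling of \n differs between shells, here is a minimal sketch of the same command using printf, which interprets \n in its format string regardless of shell. The variable name result is only there for illustration:

```shell
# printf expands \n in its format string, so no -e flag is needed
result=$(printf '1 2 foo\n2 3 bar\n42 2 baz\n' |
  awk -F '[0-9]+ [0-9]+ ' '{ print $2 }')

# Each line starts with the separator, so $1 is empty and $2 is the word
printf '%s\n' "$result"
```

This should print foo, bar and baz on separate lines, the same as the echo -e versions above.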
vallentin

Just replace \d with [0-9]. The following loop prints every field with its index, so you can see immediately how each line is split:

$ echo -e "1 2 foo\n2 3 bar\n42 2 baz" |awk -v FS="[0-9]+ [0-9]+" '{for (k=1;k<=NF;k++) print k,$k}'
1 
2  foo
1 
2  bar
1 
2  baz

So just use [0-9] in your command:

$ echo -e "1 2 foo\n2 3 bar\n42 2 baz" |awk -v FS="[0-9]+ [0-9]+" '{print $2}'
 foo
 bar
 baz
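
Note the leading space in that output: the separator [0-9]+ [0-9]+ stops before the space that precedes the word, so the space stays in $2. A small sketch comparing the two separators, with brackets added around $2 to make the whitespace visible (the variable names are just for illustration):

```shell
input=$(printf '1 2 foo\n2 3 bar\n42 2 baz')

# No trailing space in FS: the space before the word remains part of $2
with_space=$(printf '%s\n' "$input" |
  awk -v FS='[0-9]+ [0-9]+' '{ print "[" $2 "]" }')

# Trailing space in FS: the separator consumes it, so $2 is just the word
without_space=$(printf '%s\n' "$input" |
  awk -v FS='[0-9]+ [0-9]+ ' '{ print "[" $2 "]" }')

printf '%s\n' "$with_space" "$without_space"
```

The first three lines printed should be [ foo], [ bar], [ baz], and the last three [foo], [bar], [baz].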
George Vasiliou