Combining multiple awk commands with different delimiters

Question

I run two awk commands consecutively to break down a string based on multiple delimiters. I am wondering whether they can be combined into a single command.

Input data (jot -w "some string, this is number " 10):

some string, this is number 1
some string, this is number 2
some string, this is number 3
some string, this is number 4
some string, this is number 5
some string, this is number 6
some string, this is number 7
some string, this is number 8
some string, this is number 9
some string, this is number 10

This is just example data, but I want to be able to split the string on the comma first and then extract the number (the fourth word) from the second part. In practice, the number of spaces in the first part of the string could vary, so the following would also be valid input:

some string, this is number 1
some string with more spaces, this is number 2

The following command works fine:

$ jot -w "some string, this is number " 10 | awk -F ',' '{print $2}' | awk -F ' ' '{print $4}'
1
2
3
4
5
6
7
8
9
10

Is there any simple way to combine both these commands into a single one?

`awk` takes regular expressions as delimiters, but if the part before the comma can have different numbers of fields, you might have to run two `awk`s. Perhaps you could use `cut` for the "simpler" part? — Jasper, May 05 '14 at 12:58
@jasper `cut` is definitely an option, thanks, and yes, I was aware that `awk` can take a regex as the delimiter, but even if the number of spaces doesn't vary before the comma, doing two consecutive awk statements is (in my application) more readable than having to change the index of the field in the second program. Thanks for the comment though. — zelanix, May 05 '14 at 13:06
Possible duplicate of [AWK multiple delimiter](https://stackoverflow.com/q/12204192/608639) — jww, Aug 15 '18 at 23:18
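To illustrate the point raised in the comments, here is a minimal sketch of the single-regex-delimiter approach and why it is fragile when the part before the comma varies. The helper name `sixth_field` and the `printf` sample input are assumptions made for this example only.

```shell
# Hypothetical helper: treat any run of commas and blanks as one delimiter.
sixth_field() {
  awk -F'[, ]+' '{ print $6 }'
}

printf '%s\n' \
  'some string, this is number 1' \
  'some string with more spaces, this is number 2' | sixth_field
# prints "1" for the first line but "this" for the second
```

The field index has to change whenever the number of words before the comma changes, which is why splitting on the comma first is the more robust route.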

score 3 · Answer 1 · answered May 05 '14 at 13:08

The split() function will let you do this:

awk '{split($0,a,",");split(a[2],b," ");print b[4];}'
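For readers who want to see how the two `split()` calls cooperate, the one-liner can be spelled out roughly as follows. The wrapper name `extract_number` and the `printf` sample input are assumptions for illustration, not part of the original answer.

```shell
# Hypothetical helper wrapping the one-liner above, for illustration only.
extract_number() {
  awk '{
    split($0, a, ",")      # a[1] = text before the comma, a[2] = text after it
    split(a[2], b, " ")    # " " splits on runs of blanks and ignores leading ones
    print b[4]             # fourth word of the second part
  }'
}

printf '%s\n' \
  'some string, this is number 1' \
  'some string with more spaces, this is number 2' | extract_number
# prints 1 and 2, one per line
```

Because the second split only looks at the text after the comma, extra words before the comma do not shift the index.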

Vaughn Cato


+1, but should probably explicitly state that one would normally write: `awk '{split($2,a," "); print a[4]}' FS=,` and use the normal field splitting for the first delimiter. – William Pursell May 05 '14 at 13:38

score 2 · Answer 2 · answered May 05 '14 at 13:06

You can use NF to print the last column easily:

jot -w "some string, this is number " 10 |awk '{print $NF}'

Or follow your idea and merge the two awk commands into one:

jot -w "some string, this is number " 10  |awk '{l=split($2,a,OFS);print a[l]}' FS=","

BMW


score 1 · Accepted Answer · answered May 05 '14 at 20:27

To solve the problem you describe, you could use:

$ cat file
some string, this is number 1
some string with more spaces, this is number 2

$ awk -F, '{n=split($NF,a,/ /); print a[n]}' file
1
2

or if you like golf:

$ awk -F, '{print a[split($NF,a,/ /)]}' file
1
2

but obviously with the input you specified this would work:

$ awk '{print $NF}' file
1
2

as would various other solutions.

Ed Morton


Since this answer is accepted, I have to offer a suggestion on it. Using the split function with `/ /` is fine in most cases, but not always; I recommend changing it to OFS (because FS has already been set to another value). – BMW May 09 '14 at 00:32
The 3rd arg for `split` is a field separator, which is a regexp with special handling for a single blank char, `" "`. As such, the correct delimiters for an RE constant are the RE delimiters `/.../`, for both clarity and functionality (e.g. if you used string delimiters `"..."` then you'd need to double-escape any RE metacharacters to have them treated as literals). Splitting input using the Output Field Separator just because it coincidentally happens to be set to the same character that you want to split() your input on would just be unnecessary coupling and obfuscation. – Ed Morton May 09 '14 at 01:39
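The difference between the string `" "` and the regexp `/ /` discussed in these comments can be seen in a small sketch. The helper name `compare_split` and the sample line are assumptions for illustration; the output noted in the comments is what I would expect from gawk and other POSIX-style awks.

```shell
# Hypothetical helper comparing the two separator forms on the same input.
compare_split() {
  awk '{
    n = split($0, a, " ")   # special case: runs of blanks, leading blanks ignored
    m = split($0, b, / /)   # regexp: every single blank is a separator
    print n, a[1]           # 4 this
    print m, "[" b[1] "]"   # 5 []
  }'
}

# Leading blank, as found in the text after the comma.
echo ' this is number 1' | compare_split
```

With `/ /` the leading blank produces an empty first element, so a fixed index such as `a[4]` would be off by one, while the last element `a[n]` is the number in both cases.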
