To split a string into an array in awk, we use the function split():
awk '{split($0, array, ":")}'
#           \/  \___/  \_/
#           |     |     |
#      string     |     delimiter
#                 |
#               array to store the pieces
If no separator is given, split() uses FS, which defaults to a single space (that is, it splits on runs of whitespace):
$ awk '{split($0, array); print array[2]}' <<< "a:b c:d e"
c:d
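To make the default behaviour concrete, here is a minimal sketch, assuming a POSIX awk with FS left at its default. The input string and the variable name `summary` are only illustrative. It shows that leading and trailing blanks are ignored and that runs of spaces count as one separator:

```shell
# With the default FS, surrounding blanks are dropped and runs of spaces collapse
summary=$(echo "   a   b  c  " | awk '{
    n = split($0, array)          # no third argument: FS is used
    print n ": [" array[1] "] [" array[n] "]"
}')
echo "$summary"
# prints: 3: [a] [c]
```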
We can give a separator, for example ":":
$ awk '{split($0, array, ":"); print array[2]}' <<< "a:b c:d e"
b c
This is equivalent to setting it through FS:
$ awk -F: '{split($0, array); print array[2]}' <<< "a:b c:d e"
b c
In GNU Awk you can also provide the separator as a regexp:
$ awk '{split($0, array, ":*"); print array[2]}' <<< "a:::b c::d e"
# note the runs of multiple ":" in the input
b c
You can even see which separator was matched at each split by passing a fourth parameter:
$ awk '{split($0, array, ":*", sep); print array[2]; print sep[1]}' <<< "a:::b c::d e"
b c
:::
Let's quote the man page of GNU awk:

split(string, array [, fieldsep [, seps ] ])

Divide string into pieces separated by fieldsep and store the pieces in array and the separator strings in the seps array. The first piece is stored in array[1], the second piece in array[2], and so forth. The string value of the third argument, fieldsep, is a regexp describing where to split string (much as FS can be a regexp describing where to split input records). If fieldsep is omitted, the value of FS is used. split() returns the number of elements created. seps is a gawk extension, with seps[i] being the separator string between array[i] and array[i+1]. If fieldsep is a single space, then any leading whitespace goes into seps[0] and any trailing whitespace goes into seps[n], where n is the return value of split() (i.e., the number of elements in array).
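
The return value mentioned in the man page gives a convenient upper bound for looping over every piece. A minimal sketch, assuming any POSIX awk and the same sample input as above; the variable names `pieces`, `n`, and `i` are only illustrative:

```shell
# split() returns the number of pieces, which bounds the loop
pieces=$(echo "a:b c:d e" | awk '{
    n = split($0, array, ":")
    for (i = 1; i <= n; i++)
        print i ": " array[i]
}')
echo "$pieces"
# prints:
# 1: a
# 2: b c
# 3: d e
```

Looping from 1 to n keeps the pieces in their original order, which a `for (i in array)` loop does not guarantee.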