To address the immediate issue of passing a regex to awk: because of various issues with escape sequences, it's usually easier to store the pattern in a variable than to hard-code it, and then test the entire line ($0) against that variable (~ pattern_variable), e.g.:
string='"name": "Bash scripting cheatsheet",'
string2='"url": "https://devhints.io/bash"'
pattern='"([^"]*)".*"([^"]*)"'
$ awk -v ptn="${pattern}" '$0 ~ ptn {print $2}' <<< "${string}"
"Bash
$ awk -v ptn="${pattern}" '$0 ~ ptn {print $2}' <<< "${string2}"
"https://devhints.io/bash"
OK, so awk is working with the regex, but we're not getting quite what we want because awk uses white space as the default field delimiter. We can tell awk to use the double quote as the delimiter (-F'"'); the value we want sits between the 2nd pair of double quotes, which makes it the 4th field:
$ awk -v ptn="${pattern}" -F'"' '$0 ~ ptn {print $4}' <<< "${string}"
Bash scripting cheatsheet
$ awk -v ptn="${pattern}" -F'"' '$0 ~ ptn {print $4}' <<< "${string2}"
https://devhints.io/bash
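To see why $4 is the field we want, it can help to dump every field awk produces once the double quote is the delimiter. The loop below is only an illustrative sketch (it is not part of the solution itself) and reuses the same sample string:

```shell
#!/usr/bin/env bash
# Illustrative only: print each field awk sees when '"' is the delimiter.
string='"name": "Bash scripting cheatsheet",'

awk -F'"' '{ for (i = 1; i <= NF; i++) printf "$%d=[%s]\n", i, $i }' <<< "${string}"
# $1=[]
# $2=[name]
# $3=[: ]
# $4=[Bash scripting cheatsheet]
# $5=[,]
```

The empty $1 is the text before the first double quote, so the key lands in $2 and the value in $4.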
Of course, this requires spawning a subprocess each time we want to parse a string.
There are a few (better) ways to parse a string in bash without the overhead of spawning subprocess calls ...
One idea using some basic bash regex matching:
string='"name": "Bash scripting cheatsheet",'
string2='"url": "https://devhints.io/bash"'
pattern='"([^"]*)".*"([^"]*)"'
If bash finds a match it will populate the BASH_REMATCH[] array with info about the match: index 0 holds the text matched by the entire pattern, and each capture group (the part of the pattern inside a set of parens) gets its own entry, starting at index 1.
Consider:
$ [[ "${string}" =~ ${pattern} ]] && string_name_match="${BASH_REMATCH[2]}"
$ typeset -p BASH_REMATCH string_name_match
declare -ar BASH_REMATCH=([0]="\"name\": \"Bash scripting cheatsheet\"" [1]="name" [2]="Bash scripting cheatsheet")
declare -- string_name_match="Bash scripting cheatsheet"
$ echo "${string_name_match}"
Bash scripting cheatsheet
$ [[ "${string2}" =~ ${pattern} ]] && string2_url_match="${BASH_REMATCH[2]}"
$ typeset -p BASH_REMATCH string2_url_match
declare -ar BASH_REMATCH=([0]="\"url\": \"https://devhints.io/bash\"" [1]="url" [2]="https://devhints.io/bash")
declare -- string2_url_match="https://devhints.io/bash"
$ echo "${string2_url_match}"
https://devhints.io/bash
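If several strings need parsing, the same test can be wrapped in a function that hands its results back through variables rather than through $(...), which would bring the subprocess back. This is a minimal sketch: parse_pair, pair_key and pair_value are hypothetical names, not something from the question, and the pattern is assumed to be the one used above.

```shell
#!/usr/bin/env bash

# parse_pair is a hypothetical helper.
# On a match it sets the globals pair_key and pair_value and returns 0;
# on no match it returns 1 and leaves both variables untouched.
parse_pair() {
    local pattern='"([^"]*)".*"([^"]*)"'
    [[ $1 =~ $pattern ]] || return 1
    pair_key=${BASH_REMATCH[1]}
    pair_value=${BASH_REMATCH[2]}
}

lines=(
    '"name": "Bash scripting cheatsheet",'
    '"url": "https://devhints.io/bash"'
)

# Everything below runs inside the current shell; nothing is forked.
for line in "${lines[@]}"; do
    if parse_pair "${line}"; then
        printf '%s -> %s\n' "${pair_key}" "${pair_value}"
    fi
done
# name -> Bash scripting cheatsheet
# url -> https://devhints.io/bash
```

Copying the values out of BASH_REMATCH right away matters because the next =~ test overwrites the array.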