
I'm fetching an event description from an API using curl and assigning the results to a variable in bash like this:

Event=$( curl -s -X GET https://api.vendor.com/v1/events/ev_$API_ID\
    -H 'Accept: application/json' \
    -u 'mykey:' )

EVTITLE=$(echo $Event | jq -r '.name')
DESC=$(echo $Event | jq -r '.description')

This is working well so far. But sometimes the EVTITLE or DESC strings contain shell special characters such as &, !, and sometimes quotes.

Later, I pass the variable to a sed command like this, to replace values in a template file:

ti_sed="s/EVTITLE/"$EVTITLE"/"
sed -i -e "$ti_sed" filename

Where the value in $EVTITLE is something like

Amy does Q&A for you and "other things" !

I'd like to avoid having bash interpret those strings before sed goes to work. Is there a way to groom the strings so the final sed output looks like the input? For example can I get the string value of $EVTITLE between single quotes?

smartblonde
    Don’t set or expand `EVTITLE` without double quotes (which you do in the example): `EVTITLE="$(...)"`, `ti_sed="s/EVTITLE/${EVTITLE}/"` etc. You may need to replace (escape) other characters in the string though, such as both forward and back slashes: `EVTITLE="${EVTITLE//\//\\/}"`, `EVTITLE="${EVTITLE//\\/\\\\}"` etc. – Andrej Podzimek Sep 17 '21 at 04:30
  • @AndrejPodzimek thanks! here's what I wound up doing so far: `Desc=$(echo $Event | jq -r '.description') p_sed='s/\<p\>//g' ep_sed='s/\<\/p\>//g' amp_sed='s/\&/+/g;s/amp\;//g' dquote_sed='s/\"//g' squote_sed="s/\'//g" DESC=$(echo $Desc | sed -e "$p_sed;$ep_sed;$amp_sed;$dquote_sed;$squote_sed")` – smartblonde Sep 17 '21 at 16:44

1 Answer


Is there a way to groom the strings so the final sed output looks like the input?

Here's a bash demo script which reads strings from a temporary JSON file into an indexed array and has GNU sed write its own conversion script to edit a template. Note that \n, \r, \t, \u etc. in the JSON source will be converted by jq -r before bash and sed see them. The bash script reads newline-delimited lines and does not work for JSON strings containing \n.

More comments below.


#!/bin/bash
jsonfile="$(mktemp)"  templatefile="$(mktemp)"
# shellcheck disable=SC2064
trap "rm -f -- '${jsonfile}' '${templatefile}'" INT EXIT
cat << 'HERE' > "${jsonfile}"
{
  "Name":"A1",
  "Desc":"*A* \\1 /does/ 'Q&A' for you\tand \"other things\" \\@ $HOME !"
}
HERE
printf '%s\n' '---EVTITLE---' > "${templatefile}"

mapfile -t vars < <(
    jq -r '.Name, .Desc' < "${jsonfile}"
)
wait "$!" || exit   ## abort if jq failed
# shellcheck disable=SC2034
name="${vars[0]}"  desc="${vars[1]}"

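# Print desc, escape it for use as a sed replacement, wrap it in an
# s/EVTITLE/.../ command, and feed that generated script to a second sed.
# Each tee copies the intermediate text to stderr for inspection.
printf '%s\n' "${desc}" |
tee /dev/stderr |
sed -e 's/[\\/&\n]/\\&/g' -e 's/.*/s\/EVTITLE\/&\//' |
tee /dev/stderr |
sed -f /dev/stdin "${templatefile}"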

These are the 3 lines output by the script (with tabs expanding to different lengths) showing the contents of:

  1. the shell variable desc
  2. the generated sed script
  3. the edited template file

*A* \1 /does/ 'Q&A' for you and "other things" \@ $HOME !
s/EVTITLE/*A* \\1 \/does\/ 'Q\&A' for you   and "other things" \\@ $HOME !/
---*A* \1 /does/ 'Q&A' for you  and "other things" \@ $HOME !---

bash stores the string it reads and passes it unmodified, via printf, to sed. sed in turn adds the escapes needed for a replacement string inserted between s/EVTITLE/ and /, producing the sed script required to edit the template file.
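To illustrate why the quoted printf matters, here is a minimal sketch comparing a quoted and an unquoted expansion. The sample value in desc is made up for this illustration and is not taken from the script above.

```shell
#!/bin/bash
# Hypothetical sample value containing a tab and a run of two spaces.
desc="$(printf 'for you\tand  "other things" !')"

# Quoted: printf receives the string as a single argument, unchanged.
quoted="$(printf '%s\n' "${desc}")"

# Unquoted: the shell splits the value on whitespace and echo rejoins
# the words with single spaces, so the tab and the double space are lost.
# shellcheck disable=SC2086
unquoted="$(echo ${desc})"

printf 'quoted:   %s\n' "${quoted}"
printf 'unquoted: %s\n' "${unquoted}"
```

An unquoted expansion is also subject to pathname expansion, so a value containing * could turn into a list of file names; quoting avoids both effects.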

In the replacement section of a sed substitute command the following have a special meaning according to POSIX

  • \ (backslash) the escape character itself
  • the s command delimiter, default is / but it may be anything other than backslash and newline
  • & (ampersand) referencing the entire matched portion
  • \N (where N is one of the digits 1 through 9) referencing a matched group
  • a literal newline

but several seds recognize additional escapes in the replacement. For example, GNU sed interprets \f, \n, \t, \v etc. as in C, and (unless the --posix option is given) its extensions \L, \l, \U, \u, and \E act on the replacement. (For more on this, see info sed -n 'The "s" Command', info sed -n Escapes, and info sed --index-search POSIXLY_CORRECT.)
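As a small sketch of why this matters, consider a replacement that contains a backslash followed by t. The value C:\temp is a hypothetical example. The result of the unescaped form depends on the sed implementation (GNU sed is expected to treat \t as a tab), so no output is claimed for it here; the escaped form should yield the literal text in any sed.

```shell
#!/bin/bash
# Unescaped backslash: the outcome is implementation-dependent.
unescaped="$(printf '%s\n' 'EVTITLE' | sed -e 's/EVTITLE/C:\temp/')"

# Escaped backslash: "\\" in the replacement stands for one literal backslash.
safe="$(printf '%s\n' 'EVTITLE' | sed -e 's/EVTITLE/C:\\temp/')"

printf 'unescaped: %s\n' "${unescaped}"
printf 'escaped:   %s\n' "${safe}"
```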

What this means is that all backslash, command delimiter, ampersand, and newline characters in the input must be escaped, i.e. prefixed with a backslash, if they are to represent themselves when used in a replacement section. This is done by asking sed to s/[\\/&\n]/\\&/g.
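The same escaping step could be wrapped in a small helper, sketched below. The function name escape_sed_replacement and the sample template line are assumptions made for this illustration; the helper handles single-line strings only and assumes / is the s command delimiter.

```shell
#!/bin/bash
# Hypothetical helper: escape backslash, slash and ampersand so that a
# single-line string represents itself in the replacement part of s/.../.../
escape_sed_replacement() {
    printf '%s\n' "$1" | sed -e 's/[\\/&]/\\&/g'
}

title='Amy does Q&A for you and "other things" !'
escaped="$(escape_sed_replacement "${title}")"

# The expanded value is not re-parsed by the shell, so double quotes
# around the sed script are enough here.
result="$(printf '%s\n' '<h1>EVTITLE</h1>' | sed -e "s/EVTITLE/${escaped}/")"
printf '%s\n' "${result}"
```

If a different delimiter is chosen for the s command, that character has to replace / in the bracket expression.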

Recall that most of the meta characters used in regular expressions (and the shell, for that matter), such as ^$.*[]{}(), have no special meaning when appearing in the replacement section of sed's s command and so should not be escaped there. Contrariwise, & is not a regex meta character.
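A quick check of both statements, as a sketch with made-up input values: the regex metacharacters pass through the replacement untouched, and an unescaped & in the pattern matches a literal ampersand.

```shell
#!/bin/bash
# Regex metacharacters used as a replacement, with no escaping at all.
replacement='^$.*[]{}()'
result="$(printf '%s\n' 'X' | sed -e "s/X/${replacement}/")"
printf '%s\n' "${result}"

# In the pattern (left-hand side), & is an ordinary character.
amp="$(printf '%s\n' 'Q&A' | sed -e 's/&/ and /')"
printf '%s\n' "${amp}"
```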

urznow