
I need to URL encode a string using a shell function that will run in BusyBox Ash, Dash, Bash and ZSH.

It will run in different Docker containers, so it should require as few dependencies to install as possible.

Notes:

  • The question "URL encoding a string in bash script" asks for a Bash-specific script, and the only provided answer depends on PHP being installed on the container.

  • The question "How to urlencode data for curl command?" is specific to curl but open to general answers. Still, none of its 25 answers seems to apply here: one only works for passing data to curl; some are specific to Bash or ksh; others require Perl, PHP, Python, Lua, NodeJS, Ruby, gridsite-clients, uni2ascii, jq, awk or sed to be installed. One needs no additional dependencies, but it doesn't preserve characters like a, 1 and ~.

What I'd expect to have:

$> urlencode '/'
%2f

$> urlencode 'ç'
%c3%a7

$> urlencode '*'
%2a

$> urlencode abc-123~6
abc-123~6

$> urlencode 'a test ?*ç '
a%20test%20%3f%2a%c3%a7%20
Elifarley
  • Seems to be a duplicate; look here: http://stackoverflow.com/questions/296536/how-to-urlencode-data-for-curl-command – OkieOth Jun 24 '16 at 14:03

1 Answer


The functions below have been tested in BusyBox Ash, Dash, Bash and ZSH.

They only use shell builtins or core commands and should work in other shells as well.

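urlencodepipe() {
  # C locale, so fold and the pattern below work on single bytes
  local LANG=C; local c; while IFS= read -r c; do
    # Unreserved characters are printed unchanged
    case $c in [a-zA-Z0-9.~_-]) printf '%s' "$c"; continue ;; esac
    # Anything else: dump its bytes as hex and put % before each one
    printf '%s' "$c" | od -An -tx1 | tr ' ' % | tr -d '\n'
  done <<EOF
$(fold -w1)
EOF
  echo
}

# '%s' keeps % and \ in the arguments from being read as printf directives
urlencode() { printf '%s' "$*" | urlencodepipe ;}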

How it works:

  • Standard input is processed by fold -w1, which re-formats its input so that it is 1 column wide (in other words, it adds a \n after each input character so that each character will be on its own line)
  • The here-document <<EOF feeds the output of fold to the read command
  • The while loop uses read to take one line at a time (each line holds a single character) from the output of fold and assigns it to the variable c
  • case tests if that character needs to be url encoded. If it doesn't, then it's printed and the while loop continues
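  • od -An -tx1 prints each byte of the character as two hex digits, with no address column and with spaces between the bytes
  • The first tr turns each of those spaces into %, and the second tr removes the line break that od appends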
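
The od step above assumes od puts exactly one space before each byte, which not every od implementation does. Below is a minimal sketch of the same approach that does not depend on that spacing, because it lets the shell word-split od's output. The names pct_bytes and urlencode_sketch are hypothetical and not part of the functions above, and the sketch assumes an od that accepts the POSIX -v flag. Like the original, it does not encode newlines in the input.

```sh
# Hypothetical helper: percent-encode every byte read from standard input.
# The unquoted $(...) is split on whitespace, so od's padding is irrelevant.
pct_bytes() {
  # -v keeps od from collapsing repeated lines into "*"
  set -- $(od -An -v -tx1)
  if [ "$#" -gt 0 ]; then printf '%%%s' "$@"; fi
}

# Hypothetical variant of urlencode. The body runs in a subshell ( ... ),
# so the locale change does not leak into the caller and `local` is not needed.
urlencode_sketch() (
  LC_ALL=C; export LC_ALL
  printf '%s' "$*" | fold -w1 | while IFS= read -r c || [ -n "$c" ]; do
    # Unreserved characters are printed unchanged
    case $c in [a-zA-Z0-9.~_-]) printf '%s' "$c"; continue ;; esac
    # Anything else is emitted as %xx, one group per byte
    printf '%s' "$c" | pct_bytes
  done
  echo
)
```

Called as urlencode_sketch 'a test ?*', this should print a%20test%20%3f%2a on a system where fold splits input byte by byte in the C locale.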
Elifarley
  • Nice, though it's not perfectly portable, as the output from `od` varies from what you expect in non-Linux environments. For me, `urlencode ' '` returns `%%%%%%%%%%%20%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%`. Replacing the first `tr` with `sed 's/ */%/g;s/%$//'` seems to help, but it still fails to translate `\n` to `%0d`. – ghoti Aug 27 '18 at 20:58
  • This unfortunately does not work in ZSH on macOS (as @ghoti illustrates above). – michael_teter Nov 01 '22 at 11:12