
I'm calling curl from a Bash shell script. Is there a way to pass a value to -H that contains an embedded newline or other arbitrary characters? E.g., say I have a string such as:

This is line one.
This is line two.

This is line four (line three was blank).
Lots of "special" and 'funny' characters might lurk here, you know?

The key is to have the newline (or other such character that requires special handling) be encoded so that it is passed transparently through HTTP and viewed as part of the header content by the server.
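
For concreteness, a naive attempt would be something along the lines of the sketch below (the header name, variable, and URL are just placeholders). It cannot work as written, because an HTTP header value may not contain raw CR/LF characters, which is exactly why some encoding is needed:

# Hypothetical sketch only: CAPTION is assumed to hold the multi-line
# string shown above. Raw newlines cannot be sent in a header value,
# so the string needs some form of encoding first.
curl -H "X-Foo-Caption: ${CAPTION}" https://example.com/upload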

jetset

2 Answers


Before I finished writing the question, I looked into this some more and found the answer: RFC 5987 encoding (assuming the HTTP server at the other end handles it correctly).

I was able to do this in my Bash script, thanks to an answer to a related question about how to URL-encode within Bash: see the answer "Here is the pure BASH answer" to the question "URLEncode from a bash script".

My code to do the encoding:

#-----------------------------------------------
# RFC 5987 encode a string
#
# Input string is in parameter $1.
# Result is stored in global variable URL_ENCODED_STR
#-----------------------------------------------
function rfc5987_encode ()
{
    local string="${1}"
    local strlen=${#string}
    local encoded=
    local pos
    local c
    local o
    local TICK=\'

    #
    # Set up encoded string preamble, which is:
    #   charset  "'" [ language ] "'" value-chars
    #
    encoded="UTF-8${TICK}${TICK}"

    #
    # Loop through string, examining each character.
    # Safe characters are copied to the new string as-is.
    # Unsafe characters are copied as '%' and the hex code
    # of the character (using the bash built-in 'printf').
    #
    # Safe characters (the attr-char set from RFC 5987) are:
    #   ALPHA / DIGIT
    #   "!" / "#" / "$" / "&" / "+" / "-" / "."
    #   "^" / "_" / "`" / "|" / "~"
    #
    for (( pos=0 ; pos < strlen ; pos++ )); do

        c=${string:$pos:1}  # 'c' is current character

        case "$c" in
          [\!\#\$\&+.\^_\`\|~a-zA-Z0-9-] )  # safe characters copied as-is
                                            # ('-' is listed last so it is taken
                                            #  literally, not as part of a range)
                o="${c}"
                ;;
          * )                               # everything else is encoded
                printf -v o '%%%02x' "'$c"
                ;;
        esac

        encoded+="${o}"  # 'o' is output character
    done

    URL_ENCODED_STR="${encoded}"
}

In accordance with RFC 5987, one also has to add an asterisk to the end of the header field name.
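
Putting it together, a minimal usage sketch looks like the following (CAPTION is assumed to hold the multi-line text from the question, and the URL is just a placeholder):

# CAPTION is assumed to hold the multi-line caption text.
rfc5987_encode "${CAPTION}"

# Note the asterisk appended to the header field name, per RFC 5987.
curl -H "X-Foo-Caption*: ${URL_ENCODED_STR}" https://example.com/upload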

Using this, the multi-line string:

This is line one.
This is line two.

This is line four (line three was blank).
Lots of "special" and 'funny' characters might lurk here, you know?

when sent in a header field named X-Foo-Caption, ends up as:

curl -H "X-Foo-Caption*: UTF-8''This%20is%20line%20one.%0dThis%20is%20line%20two.%0d%0dThis%20is%20line%20four%20%28line%20three%20was%20blank%29.%0dLots%20of%20%22special%22%20and%20%27funny%27%20characters%20might%20lurk%20here%2c%20you%20know%3f" -H 'X-Smug-Keywords: blank;lines;funny;characters;weird special stuff;who knows?;'

To my utter amazement, the server handles this just fine.

Note that this is not URL encoding. URL encoding is used for URLs, while RFC 5987 encoding is used for HTTP header values. The results often differ, because the two schemes have different sets of safe characters, and RFC 5987 values carry the charset/language preamble. Examples:

Original     URL-encoded     RFC 5987-encoded
========     ===========     ================
"a space"    a%20space       UTF-8''a%20space
"foo"        foo             UTF-8''foo
"100%"       100%25          UTF-8''100%25
"$10.30"     %2410.30        UTF-8''$10.30
"#1 fun"     %231%20fun      UTF-8''#1%20fun

Note also that the header field name needs to have an asterisk appended, to indicate that the value has been RFC 5987 encoded, so X-Foo: #1 fun gets sent over HTTP as X-Foo*: UTF-8''#1%20fun

jetset
  • Be careful: this doesn't work well with wide characters, e.g. `€`. I don't think it's a good idea to use Bash for this anyway; there are lots of other languages (Perl, Python) that will be very happy to do it for you in a robust, consistent and more efficient way. – gniourf_gniourf Apr 30 '14 at 23:24
  • I thought that telling you that your code is broken with wide characters would be helpful. Sorry. – gniourf_gniourf May 22 '14 at 06:10
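
As the comment above points out, the "'$c" printf trick yields a character's code point rather than its UTF-8 bytes, so the function mis-encodes multibyte characters such as €. One possible workaround, sketched here only and assuming a bash version where assigning LC_ALL=C inside the function makes string operations work on bytes, is to walk the string byte by byte:

#-----------------------------------------------
# RFC 5987 encode a string, byte by byte (sketch).
#
# Forcing LC_ALL=C makes ${#string} and ${string:pos:1} operate on
# bytes, so each byte of a multibyte UTF-8 character (e.g. the three
# bytes of '€') is percent-encoded separately.
#
# Input string is in parameter $1.
# Result is stored in global variable URL_ENCODED_STR
#-----------------------------------------------
function rfc5987_encode_bytes ()
{
    local LC_ALL=C          # treat the string as raw bytes
    local string="${1}"
    local strlen=${#string} # byte count, not character count
    local encoded="UTF-8''"
    local pos c o

    for (( pos=0 ; pos < strlen ; pos++ )); do
        c=${string:$pos:1}
        case "$c" in
          [\!\#\$\&+.\^_\`\|~a-zA-Z0-9-] )  # safe characters copied as-is
                o="${c}"
                ;;
          * )                               # one %XX per byte
                printf -v o '%%%02x' "'$c"
                ;;
        esac
        encoded+="${o}"
    done

    URL_ENCODED_STR="${encoded}"
}

With that variant, rfc5987_encode_bytes '€' should give UTF-8''%e2%82%ac rather than the broken output from the character-based version.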

I would suggest netcat might be an easier way to do this. Or perhaps you need a tool that is more tailored to things like fuzz testing or penetration testing.

echo -en 'GET / HTTP/1.1\r\nHost: example.com\r\nFunny: ho\0ho\nho\r\n\r\n' | nc IP PORT

If your aim is testing software, in the past I've had success using a tool called abnfgen to generate (correct) test cases according to a grammar. You could try that, perhaps supplying a modified grammar influenced by a reading of CVEs for similar software. NUL and line-ending characters are always worth sticking in various places.

Cameron Kerr
  • This is not actually for fuzz testing, it's just a script to upload images to a photo site, so `curl` is handy because it handles cookies and other HTTP stuff. – jetset May 01 '14 at 01:37