33

I am writing an HTTP parser for a transparent proxy. What is stumping me is the Trailer: mentioned in the specs for Transfer-Encoding: chunked. What does it look like?

Normally, an HTTP chunked body ends like this:

0\r\n
\r\n

What I am confused about is how to detect the end of the chunked body if there are trailing headers...

UPDATE: I believe that a simple \r\n\r\n i.e. an empty line is enough to detect the end of trailing headers... Is that correct?

unixman83
  • Thanks for posting this, I was wondering the same thing. What was throwing me off was that the 0-length chunk doesn't have its own \r\n after the zero-length data. It is clear now that I have re-read the RFC, but it is nice to see a clear example of how it looks with some header... wish they would add that to the RFC. – eselk Mar 23 '12 at 05:37
  • 2
    So... how do you detect the chunks in a stream that is gzip-encoded? – Alexsandro Jun 19 '12 at 20:22
  • 1
    @Alexsandro_xpt - the message body is first compressed, then chunked, so that you can decode the chunk encoding without de-compressing anything. http://tools.ietf.org/html/rfc7230#section-3.3.1 – Hawkeye Parker Sep 03 '14 at 08:21

3 Answers

18

Below is an example trailer copied from The TCP/IP Guide site. [Image: trailer sample]

As we can see, to use a trailer header we need to add a "Trailer: header_name" header field that names the header, and then send the trailer header itself after the chunked body.

Per the RFC, we can add zero or more trailer headers after the chunked body. Section 4.1.2 of RFC 7230 bans the following headers from the trailer area:

A sender MUST NOT generate a trailer that contains a field necessary for message framing (e.g., Transfer-Encoding and Content-Length), routing (e.g., Host), request modifiers (e.g., controls and conditionals in Section 5 of RFC7231), authentication (e.g., see RFC7235 and RFC6265), response control data (e.g., see Section 7.1 of RFC7231), or determining how to process the payload (e.g., Content-Encoding, Content-Type, Content-Range, and Trailer).

This means we can use other standard headers, as well as custom headers, in the trailer area.
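As a rough illustration of that rule, the check below rejects only the field names that the quoted paragraph lists as examples. It is a sketch, not a complete implementation of Section 4.1.2: the categories in the quote (request modifiers, authentication, response control data) cover more fields than are listed here, and the names `FORBIDDEN_TRAILER_FIELDS` and `is_allowed_trailer_field` are made up for this example.

```python
# Field names that the quoted RFC 7230 text gives as explicit examples.
# Assumption: this set is illustrative only; the RFC's categories are broader.
FORBIDDEN_TRAILER_FIELDS = frozenset({
    "transfer-encoding",
    "content-length",
    "host",
    "content-encoding",
    "content-type",
    "content-range",
    "trailer",
})


def is_allowed_trailer_field(name):
    """Return True if `name` is not one of the example forbidden fields.

    Header field names are case-insensitive, so compare in lowercase.
    """
    return name.strip().lower() not in FORBIDDEN_TRAILER_FIELDS
```

For instance, `is_allowed_trailer_field("Content-MD5")` is accepted while `is_allowed_trailer_field("content-length")` is rejected.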

appleleaf
17

0\r\n
SomeAfterHeader: TheData \r\n
\r\n

In other words, to detect the end of a chunked transmission it is sufficient to look for a \r\n\r\n (in layman's terms, a blank line) once the zero-length last chunk has been reached. But it is very important that each chunk is read, using its declared size, before doing this, because the chunk data itself can contain blank lines, which would erroneously be detected as the end of the stream.
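A minimal sketch of that reading order, assuming the body is available as a binary file-like object (for example a socket wrapped with `makefile("rb")`). The name `read_chunked` is hypothetical, and the size limits and stricter size-line validation that a real proxy needs are left out.

```python
import io


def read_chunked(stream):
    """Decode a chunked body; return (body_bytes, trailer_lines)."""
    body = bytearray()
    while True:
        size_line = stream.readline()
        if not size_line.endswith(b"\r\n"):
            raise ValueError("truncated chunk-size line")
        # Ignore any chunk extension after ';' and parse the hex size.
        size_field = size_line.split(b";", 1)[0].strip()
        size = int(size_field.decode("ascii"), 16)
        if size == 0:
            break  # last-chunk reached; trailers (if any) follow
        # Read the data by length, never by scanning for blank lines,
        # because the data may itself contain \r\n\r\n.
        data = stream.read(size)
        if len(data) != size:
            raise ValueError("truncated chunk data")
        body += data
        if stream.read(2) != b"\r\n":
            raise ValueError("missing CRLF after chunk data")

    # After the last chunk, read lines until the empty line that ends
    # the message. Zero trailers simply means the first line is empty.
    trailers = []
    while True:
        line = stream.readline()
        if line == b"\r\n":
            break
        if not line.endswith(b"\r\n"):
            raise ValueError("truncated trailer section")
        trailers.append(line[:-2])
    return bytes(body), trailers


raw = b"4\r\nWiki\r\n6\r\n\r\n\r\nab\r\n0\r\nSomeAfterHeader: TheData\r\n\r\n"
body, trailers = read_chunked(io.BytesIO(raw))
```

The second chunk in `raw` deliberately contains a blank line in its data; because the data is consumed by its declared length, that blank line is not mistaken for the end of the message.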

unixman83
  • 2
    @unixman83: If your answer is not correct (as Hawkeye Parker indicated), you should either correct it or unmark this as the accepted answer. Don't mislead SO users. Many people, including me, take SO answers for granted, without reading all the comments, because it is often trustworthy. This seems to be an "Exception" that visitors should "Catch"!! – M-D May 12 '16 at 11:53
  • 1
    @HawkeyeParker The answer is correct. Looking for a blank line will always correctly detect the end of chunked data as long as you skip the chunks themselves, and it will work whether there is a trailer or not. The processing you suggest ignores the existence of trailers: even if you found the end-of-chunk marker, you must continue reading up to a blank line anyway, which may follow directly or follow after trailers. – Mecki Mar 08 '22 at 11:08
  • @Mecki Revisiting the [ABNF](https://datatracker.ietf.org/doc/html/rfc7230#section-4.1), I agree. Thanks for correcting! I have deleted my previous comment. – Hawkeye Parker Mar 09 '22 at 17:25
  • @M-D see Mecki's comment and my correction. You may want to delete your comment... – Hawkeye Parker Mar 09 '22 at 17:26
16

Regarding trailer:

The list of trailing headers should be specified in the Trailer header, as you note.

The BNF in Section 14.40 of RFC 2616 is this:

Trailer  = "Trailer" ":" 1#field-name

Gourley and Totty give this example:

Trailer: Content-Length

(It's odd that they give this example, since Content-Length is explicitly forbidden to be a trailing header in 14.40.)

Shiflett gives this example:

Trailer: Date
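The `1#field-name` in the BNF above means "one or more field names, separated by commas" (the `#` list rule from Section 2.1 of RFC 2616), so a header announcing several trailers is a comma-separated list such as `Trailer: Date, Content-MD5`. A small sketch of splitting such a value follows; `parse_trailer_header` is a hypothetical helper name, and it does not validate that each entry is a legal token.

```python
def parse_trailer_header(value):
    """Split a Trailer header value into its announced field names.

    Whitespace around commas is optional, and empty list elements are
    skipped rather than treated as names.
    """
    return [name.strip() for name in value.split(",") if name.strip()]


announced = parse_trailer_header("Date, Content-MD5")
```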

Regarding end of message with trailing headers:

The BNF in Section 3.6.1 of RFC 2616 is what you're looking for. Here's part:

Chunked-Body = *chunk
               last-chunk
               trailer
               CRLF
last-chunk   = 1*("0") [ chunk-extension ] CRLF
trailer      = *(entity-header CRLF)

So the last chunk and 2 trailing headers might look like this:

0<CRLF>
Date:Sun, 06 Nov 1994 08:49:37 GMT<CRLF>
Content-MD5:1B2M2Y8AsgTpgAmY7PhCfg==<CRLF>
<CRLF>
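Once the bytes after the last chunk have been isolated, each trailer line can be split into a name and a value the same way as an ordinary header line. The sketch below does this for the example above; `parse_trailer_fields` is a hypothetical helper name, and obsolete line folding and token validation are not handled.

```python
def parse_trailer_fields(block):
    """Parse the bytes that follow the last-chunk line into (name, value) pairs.

    Parsing stops at the empty line that terminates the chunked body.
    """
    fields = []
    for line in block.split(b"\r\n"):
        if line == b"":
            break  # the final CRLF of the Chunked-Body
        # Split on the first colon only; values such as dates contain colons.
        name, sep, value = line.partition(b":")
        if not sep:
            raise ValueError("trailer line without a colon")
        fields.append((name.decode("ascii").strip(),
                       value.decode("latin-1").strip()))
    return fields


block = (b"Date:Sun, 06 Nov 1994 08:49:37 GMT\r\n"
         b"Content-MD5:1B2M2Y8AsgTpgAmY7PhCfg==\r\n"
         b"\r\n")
fields = parse_trailer_fields(block)
```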
james.garriss
  • 1
    Why on earth do people give examples which only demonstrate the simple cases???? What do you do if there are multiple headers in the Trailers? Do you use a comma-separated list or what? – brettwhiteman Nov 07 '14 at 22:32
  • 6
    Why on earth do people not bother to read the spec for themselves???? The answer to your question is already in my answer. Want a clue? It's 1#field. Want another? Go here: http://tools.ietf.org/html/rfc2616#section-2.1. – james.garriss Nov 08 '14 at 02:15