5

Given an HTTP request header, does anyone have suggestions or know of existing code to parse it properly? I am trying to do this with core Java only, with no third-party libraries.

Edit:

Trying to find key fields from this String for example:

GET / HTTP/1.1User-Agent: curl/7.19.7 (x86_64-pc-linux-gnu) libcurl/7.19.7 OpenSSL/0.9.8k zlib/1.2.3.3 libidn/1.15Host: localhost:9000Accept: */*

I want to parse out key fields such as the method and the host.

RandomUser
  • Define "properly parse". Do you reject if the header says something different than what you're expecting? Outside of that, you would be best served looking into Java Socket programming, which can read the raw bits from the line. – Makoto Aug 16 '12 at 14:43
  • I can receive the header and store it as a String, now what I am trying to figure out is the best way to parse the header to find key fields such as: host, method, etc. GET / HTTP/1.1User-Agent: curl/7.19.7 (x86_64-pc-linux-gnu) libcurl/7.19.7 OpenSSL/0.9.8k zlib/1.2.3.3 libidn/1.15Host: localhost:9000Accept: */* – RandomUser Aug 16 '12 at 14:45

4 Answers

5

I wrote a library, RawHTTP, whose only purpose is to parse HTTP messages (requests and responses).

If you don't want to use a library, you could copy the source into your own code base, starting from this: https://github.com/renatoathaydes/rawhttp/blob/a6588b116a4008e5b5840d4eb66374c0357b726d/rawhttp-core/src/main/java/com/athaydes/rawhttp/core/RawHttp.java#L52

This will split the lines of the HTTP message all the way to the end of the metadata sections (start-line + headers).

With the list of metadata lines at hand, you can then call the parseHeaders method, which will create the headers for you. You can easily adapt that to just return a Map<String, List<String>> to avoid having to also import the header classes.

That said... RawHTTP has no dependencies, so I would just use it instead :) but up to you.
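
For readers who only want the Map<String, List<String>> idea without copying any library code, here is a minimal sketch in plain Java. It is not RawHTTP's implementation: the class name HeaderLines and the method toHeaderMap are hypothetical, and it assumes the header lines have already been split out (start-line removed, no folded or multi-line headers).

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical helper, not part of RawHTTP: groups header lines by field name.
final class HeaderLines {

    // Takes the lines that follow the start-line (e.g. "Host: localhost:9000")
    // and returns a map from field name to every value seen for that name.
    static Map<String, List<String>> toHeaderMap(List<String> headerLines) {
        // Header field names are case-insensitive, so keys are compared ignoring case.
        Map<String, List<String>> headers = new TreeMap<>(String.CASE_INSENSITIVE_ORDER);
        for (String line : headerLines) {
            int colon = line.indexOf(':');
            if (colon <= 0) {
                throw new IllegalArgumentException("Malformed header line: " + line);
            }
            String name = line.substring(0, colon);
            String value = line.substring(colon + 1).trim();
            // A header may legitimately appear more than once, so keep a list per name.
            headers.computeIfAbsent(name, k -> new ArrayList<>()).add(value);
        }
        return headers;
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList(
                "User-Agent: curl/7.19.7",
                "Host: localhost:9000",
                "Accept: */*");
        Map<String, List<String>> headers = toHeaderMap(lines);
        System.out.println(headers.get("host")); // prints [localhost:9000]
    }
}
```

The case-insensitive TreeMap means a lookup for "host", "Host" or "HOST" finds the same entry; the map keeps whichever spelling was seen first.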

Renato
4

Start by reading and understanding the HTTP specification.

The request line and headers are separated by CR LF sequences (bytes with decimal value 13 and 10), so you can read the stream and separate out each line. I believe that the headers must be encoded in US-ASCII, so you can simply convert bytes to characters and append to a StringBuilder (but check the spec: it may allow ISO-8859-1 or another encoding).

The end of the headers is signified by CR LF CR LF.
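
A rough sketch of that idea in core Java follows. The class and method names (HeaderBlockReader, readHeaderLines) are made up for illustration, and it assumes US-ASCII is acceptable for the header bytes, which is worth checking against the spec as noted above. It reads one byte at a time so that nothing after the blank line (the body) is consumed.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;

// Hypothetical helper: pulls the request line and headers off a raw byte stream.
final class HeaderBlockReader {

    // Reads bytes up to and including the first CR LF CR LF and returns the
    // lines before it (request line first, then one header per element).
    static String[] readHeaderLines(InputStream in) throws IOException {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        int matched = 0; // how many bytes of CR LF CR LF have been seen in a row
        int b;
        while (matched < 4 && (b = in.read()) != -1) {
            buf.write(b);
            boolean expectCr = (matched % 2 == 0);
            if (b == (expectCr ? 13 : 10)) {
                matched++;
            } else {
                // A stray CR can still be the start of a new CR LF CR LF sequence.
                matched = (b == 13) ? 1 : 0;
            }
        }
        if (matched < 4) {
            throw new IOException("Stream ended before the blank line");
        }
        // Assumption: header bytes are US-ASCII.
        String text = new String(buf.toByteArray(), StandardCharsets.US_ASCII);
        // Drop the terminating CR LF CR LF, then split on the remaining CR LFs.
        return text.substring(0, text.length() - 4).split("\r\n");
    }

    public static void main(String[] args) throws IOException {
        byte[] raw = "GET / HTTP/1.1\r\nHost: localhost:9000\r\nAccept: */*\r\n\r\n"
                .getBytes(StandardCharsets.US_ASCII);
        for (String line : readHeaderLines(new ByteArrayInputStream(raw))) {
            System.out.println(line);
        }
    }
}
```

On a real socket you would probably wrap the stream in a BufferedInputStream first, since single-byte reads on an unbuffered stream are slow.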

parsifal
4

Your concatenated one-line string is not a valid HTTP header.

A proper HTTP request message should look like this (though not always):

GET / HTTP/1.1 CRLF
Host: localhost:9000 CRLF
User-Agent: curl/7.19.7 blar blar CRLF
Accept: */* CRLF
Content-Length: ?? CRLF
...: ... CRLF
CRLF
octets

See here http://www.w3.org/Protocols/rfc2616/rfc2616-sec5.html

If you want to implement an HTTP server without the help of Servlets or Java EE containers, you should use Sockets.

  1. Read the first line [Request-Line = Method SP Request-URI SP HTTP-Version CRLF]
  2. Read the request headers line by line until you reach the blank line
  3. For each header line you can parse [fieldName: fieldValue]
  4. Read the entity body.

This is NOT the only form an HTTP message can take.
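
The first three steps could be sketched roughly like this with core Java only. The class SimpleRequest and its fields are hypothetical names chosen for this example. It assumes each header fits on one line, and it keeps only the last value when a header repeats. Note that BufferedReader.readLine also accepts a bare LF or CR as a line ending, which is more lenient than the grammar above, and that the reader may buffer part of the entity body, so step 4 needs care if you read the body from the underlying stream.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;

// Hypothetical value class holding the parsed request line and headers.
final class SimpleRequest {
    final String method;
    final String requestUri;
    final String httpVersion;
    final Map<String, String> headers; // keys are lower-cased field names

    private SimpleRequest(String method, String requestUri, String httpVersion,
                          Map<String, String> headers) {
        this.method = method;
        this.requestUri = requestUri;
        this.httpVersion = httpVersion;
        this.headers = headers;
    }

    static SimpleRequest parse(BufferedReader reader) throws IOException {
        // Step 1: Request-Line = Method SP Request-URI SP HTTP-Version
        String requestLine = reader.readLine();
        if (requestLine == null) {
            throw new IOException("Empty request");
        }
        String[] parts = requestLine.split(" ");
        if (parts.length != 3) {
            throw new IOException("Malformed request line: " + requestLine);
        }

        // Steps 2 and 3: read "fieldName: fieldValue" lines until the blank line.
        Map<String, String> headers = new LinkedHashMap<>();
        String line;
        while ((line = reader.readLine()) != null && !line.isEmpty()) {
            int colon = line.indexOf(':');
            if (colon <= 0) {
                throw new IOException("Malformed header line: " + line);
            }
            // Field names are case-insensitive, so normalise them to lower case.
            String name = line.substring(0, colon).toLowerCase(Locale.ROOT);
            headers.put(name, line.substring(colon + 1).trim());
        }

        // Step 4 (the entity body) is left to the caller.
        return new SimpleRequest(parts[0], parts[1], parts[2], headers);
    }

    public static void main(String[] args) throws IOException {
        String raw = "GET / HTTP/1.1\r\n"
                + "User-Agent: curl/7.19.7\r\n"
                + "Host: localhost:9000\r\n"
                + "Accept: */*\r\n"
                + "\r\n";
        SimpleRequest request = parse(new BufferedReader(new StringReader(raw)));
        System.out.println(request.method + " " + request.headers.get("host"));
        // prints GET localhost:9000
    }
}
```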

Jin Kwon
0

I'm using the Guava library for precondition checks in my methods. You can replace them with plain null checks.

  // Assumed imports (not shown in the original): Guava, Apache HttpClient 4.x, Spring's HttpHeaders.
  import com.google.common.base.Preconditions;
  import org.apache.http.Header;
  import org.apache.http.HttpResponse;
  import org.springframework.http.HttpHeaders;

  /**
   * @return a string of the HTTP headers, with names and values on alternating lines
   * separated by the platform line separator, suitable for storing in a database.
   */
  public static final String httpHeadersToString(final HttpResponse httpResponse) {
    Preconditions.checkNotNull(httpResponse);
    Preconditions.checkNotNull(httpResponse.getAllHeaders());

    final Header[] allHeaders = httpResponse.getAllHeaders();
    final String separator = System.lineSeparator();
    // StringBuilder is enough here: the buffer is never shared between threads.
    final StringBuilder sb = new StringBuilder();
    for (int index = 0; index < allHeaders.length; index++) {
      if (index > 0) {
        sb.append(separator);
      }
      sb.append(allHeaders[index].getName())
         .append(separator)
         .append(allHeaders[index].getValue());
    }
    return sb.toString();
  }

  /**
   * @return HTTP headers reconstructed from a string produced by httpHeadersToString.
   */
  public final HttpHeaders stringToHttpHeaders(final String headerContents) {
    final HttpHeaders httpHeaders = new HttpHeaders();
    final String[] tempHeaderArray = headerContents.split(System.lineSeparator());
    // Names and values alternate, so consume them in pairs. The bound stops before a
    // trailing name with no value instead of running past the end of the array.
    for (int i = 0; i + 1 < tempHeaderArray.length; i += 2) {
      httpHeaders.add(tempHeaderArray[i], tempHeaderArray[i + 1]);
    }
    return httpHeaders;
  }
amadib
  • http://stackoverflow.com/questions/5757290/http-header-line-break-style may help understand exactly what's going on – amadib Jun 19 '15 at 11:48