RFC 3875 defines CONTENT_LENGTH this way:

The CONTENT_LENGTH variable contains the size of the message-body attached to the request, if any, in decimal number of octets. If no data is attached, then NULL (or unset).

CONTENT_LENGTH = "" | 1*digit

The server MUST set this meta-variable if and only if the request is accompanied by a message-body entity. The CONTENT_LENGTH value must reflect the length of the message-body after the server has removed any transfer-codings or content-codings.

I am not sure which size this is.

Is it the size of the data the client wants to send, that is, the value of the HTTP request header "Content-Length"?

Or is it the size of the data the HTTP server read from the request before the data is sent to the CGI?

I ask because I am not sure who is responsible for checking that the amount of data the client wants to send matches the amount of data that actually arrived. Is the HTTP server responsible for checking that the size matches the value in the request header, or is the CGI responsible for counting the bytes that arrive on STDIN? And if so, how does the CGI get the value of the request header?

Right now I do this in my CGI:

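# Save the request body from STDIN, then compare its size with CONTENT_LENGTH.
cat > "$TMP/upload.csv"
size=$(stat -c %s "$TMP/upload.csv")
if [ "$CONTENT_LENGTH" != "$size" ]; then
  echo "Size mismatch"
  exit 1  # "return" is only valid inside a function or a sourced script
fi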

But is it the right thing to do?

ceving

1 Answer

The HTTP Content-Length header, already discussed in "What's the "Content-Length" field in HTTP header?", defines the length of the body only within the scope of the HTTP protocol.

If your body is transmitted as binary, the Content-Length is the same as the length of the data within the HTTP request.

When your body is encoded as 7-bit ASCII, the Content-Length is usually larger than the size of the actual data.
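As a rough illustration of that size difference, the sketch below compares the raw and encoded byte counts of a 300-byte payload. It assumes Base64 as the 7-bit-safe encoding, which is only one possible choice and is not named above, and the variable names are made up for the example.

```shell
#!/bin/sh
# Illustration only: Base64 is assumed as an example of a 7-bit-safe encoding.

# Size of a 300-byte payload made of zero bytes.
raw_size=$(head -c 300 /dev/zero | wc -c | tr -d ' ')

# Size of the same payload after Base64 encoding; line breaks added by
# the base64 tool are stripped so only encoded characters are counted.
encoded_size=$(head -c 300 /dev/zero | base64 | tr -d '\n' | wc -c | tr -d ' ')

echo "raw: $raw_size bytes, encoded: $encoded_size bytes"
```

Base64 maps every 3 input bytes to 4 output characters, so the encoded body here is a third larger than the raw data.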

To summarize: Content-Length needs to be computed before the body of the HTTP request is transmitted.

In your case, you cannot meaningfully compare the Content-Length with the file length. The two values are not directly connected.

Your file may be transmitted with multiple HTTP requests and responses carrying partial content, and each HTTP request and response will get its own Content-Length.

If your intent is an integrity check, then use HTTPS for transmission integrity, or use checksums, or use another protocol such as rsync that has a built-in integrity check.
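One possible shape of a checksum-based check in a shell CGI is sketched below. It assumes that the client sends the SHA-256 of the body in a custom request header, here called X-Content-SHA256 (a hypothetical name that a CGI server would expose as HTTP_X_CONTENT_SHA256), and that `sha256sum` from GNU coreutils is installed. The function name `verify_upload` is also made up for this example.

```shell
#!/bin/sh
# verify_upload FILE EXPECTED_HEX
# Succeeds (status 0) only when EXPECTED_HEX is non-empty and equals the
# SHA-256 digest of FILE; fails (status 1) otherwise.
# Assumption: sha256sum (GNU coreutils) is available.
verify_upload() {
  actual=$(sha256sum "$1" | cut -d ' ' -f 1)
  [ -n "$2" ] && [ "$actual" = "$2" ]
}

# Hypothetical use inside the CGI, after the body was saved to a file:
#   verify_upload "$TMP/upload.csv" "$HTTP_X_CONTENT_SHA256" || exit 1
```

Rejecting an empty expected value keeps a missing header from passing the check when the file could not be read either.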

Léa Gris
  • I do the post with `application/octet-stream` without encoding. – ceving Jan 21 '20 at 11:59
  • This does not answer the question. You talk about the HTTP header `Content-Length`. I talk about the CGI variable `CONTENT_LENGTH`. – ceving Jan 21 '20 at 12:05
  • Well, I could soften the statement that `Content-Length` is disconnected from the file length. Some downloaders manage partial downloads by looking at the file length and requesting the remainder at a computed offset. Anyway, comparing the size with `Content-Length` is a very weak check for continuing a partial transmission. A more elaborate integrity check is still better. – Léa Gris Jan 21 '20 at 12:05
  • @ceving the `CGI` variable `CONTENT_LENGTH` is just the copy of the `Content-Length: ` `HTTP` header. – Léa Gris Jan 21 '20 at 12:07
  • Really? I gave some arguments in my question, that this might be wrong. – ceving Jan 21 '20 at 12:09