10

I am facing a problem where a remote web client with a slow connection fails to send a complete POST request with multipart/form-data content, but PHP still uses the partially received data to populate the $_POST array. As a result, one value in $_POST can be incomplete and further values can be missing. I first asked the same question on the Apache list and was told that Apache doesn't buffer the request body and passes it to the PHP module as a giant blob.

Here is my sample POST request:

POST /test.php HTTP/1.0
Connection: close
Content-Length: 10000
Content-Type: multipart/form-data; boundary=ABCDEF

--ABCDEF
Content-Disposition: form-data; name="a"

A
--ABCDEF

You can see that Content-Length is 10000 bytes, but I send just one variable, a=A.

The PHP script is:

<?php print_r($_REQUEST); ?>

The web server waits about 10 seconds for the rest of my request (but I don't send anything) and then returns this response:

HTTP/1.1 200 OK
Date: Wed, 27 Nov 2013 19:42:20 GMT
Server: Apache/2.2.22 (Debian)
X-Powered-By: PHP/5.4.4-14+deb7u3
Vary: Accept-Encoding
Content-Length: 23
Connection: close
Content-Type: text/html

Array
(
     [a] => A
)

So here is my question: How can I verify in PHP that the post request was received completely? $_SERVER['CONTENT_LENGTH'] would show 10000 from the request header, but is there a way to check the real content length received?

spatar
  • could suhosin be your answer? http://stackoverflow.com/a/8451656/1001641 – Naveed Hasan Nov 28 '13 at 00:11
  • @Naveed From the [Suhosin feature list](http://www.hardened-php.net/suhosin/a_feature_list.html) it seems that they only support limits on the length of variable names/values, but that is not what I need. I need to verify that Content-Length matches the size of the message body actually received. – spatar Nov 28 '13 at 00:59
  • The remote client is actually a browser with HTML page? – MeNa Dec 04 '13 at 09:35
  • @MeNa No, it is a custom application, I can make modifications to it. – spatar Dec 04 '13 at 17:24

12 Answers

3

I guess that the remote client is actually a browser with an HTML page. Otherwise, let me know and I'll try to adapt my solution.

You can add a field such as <input type="hidden" name="complete"> as the last parameter. In PHP, first check whether this parameter was sent by the client. If it was, you can be sure that you got the entire data.

Now, I'm not sure that the order of parameters must be preserved according to the RFCs (of both HTML and HTTP), but I've tried some variations and saw that the order was indeed kept.

A better solution would be to calculate a hash of the parameters on the client side and send it as another parameter, so you can be absolutely sure that you got the entire data. But this is starting to sound complicated...

MeNa
  • That was my idea too - to add one more parameter at the end. But I worry if some proxy in the middle can reassemble request in a different order. I wonder if I can just get real POST size received in PHP. – spatar Dec 04 '13 at 17:26
  • 1
    What about the last part of my answer? Send the hash of the parameters too, then on your server calculate the hash again and compare. – MeNa Dec 04 '13 at 19:04
  • I think this is the best I can do but I want to wait a little bit longer for a better solution. – spatar Dec 08 '13 at 07:13
2

As far as I know there is no way to check if the size of received content matches the value of the Content-Length header when using multipart/form-data as Content-Type, because you cannot get hold of the raw content.

1) If you can change Content-Type (to application/x-www-form-urlencoded for example) you can read php://input, which will contain the raw content of the request. The size of php://input should match Content-Length (assuming the value of Content-Length is correct). If there's a match, you can still use $_POST to get the processed content (regular post data). Read about php://input here.

2) Or you can serialize the data on the client and send it as text/plain. The server can check the size the same way as described above, and will need to unserialize the received content to be able to work with it. If the client also generates a hash of the serialized data and sends it along in a header (X-Content-Hash, for example), the server can generate a hash of what it received and check whether it matches the one in the header. In that case you won't need to check the size, and can be 100% sure the content is correct.

3) If you cannot change Content-Type, you'll need something different from size to verify the content. The client could use an extra header (something like X-Form-Data-Fields) to sum up the fields/keys/names of the content you're sending. The server could then check if all fields mentioned in the header are present in the content.

4) Another solution would be for the client to have a predefined key/value as last entry in the content. Something like:

--boundary
Content-Disposition: form-data; name="_final_field_"

TRUE
--boundary--

The server can check if that field is present in the content, if so the content must be complete.
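
Options 3 and 4 could be checked on the server roughly as follows. This is only a sketch: `hasFinalField()` and `missingFields()` are made-up helper names, `_final_field_` and `X-Form-Data-Fields` are the illustrative names from the text above, and a comma-separated header value is an assumption.

```php
<?php
/**
 * Option 4: the request counts as complete only if the sentinel field,
 * which the client always sends last, arrived with its expected value.
 * The field name and value are the illustrative ones from the answer.
 */
function hasFinalField(array $post, string $name = '_final_field_', string $expected = 'TRUE'): bool
{
    return isset($post[$name]) && $post[$name] === $expected;
}

/**
 * Option 3: returns the field names announced in the (hypothetical,
 * comma-separated) X-Form-Data-Fields header that are missing from the
 * parsed body. An empty array means every announced field is present.
 */
function missingFields(string $headerValue, array $post): array
{
    // Split on commas, trim whitespace, and drop empty entries.
    $announced = array_filter(array_map('trim', explode(',', $headerValue)), 'strlen');
    return array_values(array_diff($announced, array_keys($post)));
}

// Usage inside a request handler:
// $header = $_SERVER['HTTP_X_FORM_DATA_FIELDS'] ?? '';
// if (!hasFinalField($_POST) || missingFields($header, $_POST) !== []) {
//     http_response_code(400);
//     exit('Incomplete request');
// }
```

Note that the field-list check only proves that each field is present, not that its value arrived in full. Comparing the sentinel against an exact value covers the last field, which is the one most likely to be truncated.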

update

When you need to pass binary data, you can't use option 1, but can still use option 2:

The client can base64 encode the binary entries, serialize the data (with any technique you like), generate a hash of the serialized data, send the hash as header and data as body. The server can generate a hash of the received content, check the hash with the one in the header (and report a mismatch), unserialize the content, base64 decode the binary entries.

This is a bit more work than plainly using multipart/form-data, but the server can verify with a 100% guarantee that the content is the same as what the client sent.
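
A minimal sketch of the server-side hash check, assuming the client sends a hex-encoded SHA-256 digest of the raw body in the `X-Content-Hash` header. The header name comes from the text above; the choice of SHA-256 and the function name `bodyMatchesHash()` are assumptions.

```php
<?php
/**
 * Returns true when the SHA-256 digest of the raw body equals the digest
 * the client declared. Assumes the client sends the digest hex-encoded.
 */
function bodyMatchesHash(string $rawBody, string $declaredHash): bool
{
    // A SHA-256 hex digest is always 64 characters long.
    if (strlen($declaredHash) !== 64) {
        return false;
    }
    // hash_equals() compares the two strings in constant time.
    return hash_equals(hash('sha256', $rawBody), strtolower($declaredHash));
}

// Usage (only when PHP does not parse the body itself, e.g. text/plain):
// $raw = file_get_contents('php://input');
// if (!bodyMatchesHash($raw, $_SERVER['HTTP_X_CONTENT_HASH'] ?? '')) {
//     http_response_code(400);
//     exit('Body does not match the declared hash');
// }
// $data = json_decode($raw, true); // or whichever serialization the client uses
```

The check runs on the raw bytes before any unserializing, so a truncated body is rejected before the application touches it.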

Jasper N. Brouwer
  • I pass some binary data so I don't want to use `application/x-www-form-urlencoded` because it will increase data size. I like the idea of using custom headers and I think your solution can be simplified by passing total length of all parameter names and values in a header and checking it in PHP using `$_REQUEST` or `$_POST`. As for adding a `_final_field_`, MeNa already mentioned this idea. – spatar Dec 08 '13 at 07:18
  • Yes, with binary data option 1 won't do. I've updated my answer accordingly. – Jasper N. Brouwer Dec 09 '13 at 09:11
1

If you can change the enctype to

multipart/form-data-alternate

then you can check

strlen(file_get_contents('php://input'))

vs.

$_SERVER['CONTENT_LENGTH']
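
As a sketch, the comparison could be wrapped like this (`bodyLengthMatches()` is a made-up name, and the nullable type hint assumes PHP 7.1 or later). Keep in mind that php://input stays empty for a genuine multipart/form-data request, and that with a non-standard content type PHP leaves $_POST and $_FILES empty, so the application would have to parse the parts itself.

```php
<?php
/**
 * Returns true when the number of body bytes actually read equals the
 * declared Content-Length. strlen() counts bytes, which is the unit
 * Content-Length is expressed in.
 */
function bodyLengthMatches(string $rawBody, ?string $declaredLength): bool
{
    if ($declaredLength === null || !ctype_digit($declaredLength)) {
        return false; // header missing or not a plain non-negative integer
    }
    return strlen($rawBody) === (int) $declaredLength;
}

// Usage:
// $raw = file_get_contents('php://input');
// if (!bodyLengthMatches($raw, $_SERVER['CONTENT_LENGTH'] ?? null)) {
//     http_response_code(400);
//     exit('Request body is incomplete');
// }
```
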
corretge
1

This is a known bug in PHP and needs to be fixed there - https://bugs.php.net/bug.php?id=61471

mp_de
  • While the link can answer to the question it is better to include all relevant parts of its content in your answer and put the link only as reference... – DaFois Dec 12 '17 at 15:42
0

They probably get cut off by limits in Apache or PHP. I believe Apache also has a config variable for this.

Here are the PHP settings:

php.ini

post_max_size=20M
upload_max_filesize=20M

.htaccess

php_value post_max_size 20M
php_value upload_max_filesize 20M
vokx
  • It is not about size limits. POST size is just a few kilobytes. The problem here is if the connection is lost while posting request, PHP still interprets partially received data, no matter that Content-Length did not match. – spatar Nov 28 '13 at 00:48
0

Regarding form values that are completely missing due to connectivity issues, you can just check if they are set:

if (isset($_POST['key'])) {
    //value is set
}else{
    //connection was interrupted
}

For the large form data (such as an image upload) you could check the size of the received file using

$_FILES['key']['size']

A simple solution might use JavaScript to calculate the file size on the client side, and append that value to the form as a hidden input on form submission. You get the file size in JS using something like

var filesize = input.files[0].size;

Reference: JavaScript file upload size validation

Then on file upload, if the hidden form input's value matches the size of the uploaded file, the request was not interrupted by network connectivity issues.
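
On the PHP side, that comparison might look like the sketch below. The field names `photo` and `photo_size` and the helper `uploadSizeMatches()` are hypothetical. PHP also reports `UPLOAD_ERR_PARTIAL` in `$_FILES[...]['error']` when a file was only partially uploaded, so the error code is checked first.

```php
<?php
/**
 * Compares the size the client declared in a hidden field with the size
 * PHP recorded for the upload. Both arrays are passed in so the function
 * can be exercised without a real request.
 */
function uploadSizeMatches(array $files, array $post, string $fileKey, string $sizeKey): bool
{
    if (!isset($files[$fileKey]['size'], $files[$fileKey]['error'], $post[$sizeKey])) {
        return false; // the file part or the hidden field never arrived
    }
    if ($files[$fileKey]['error'] !== UPLOAD_ERR_OK) {
        return false; // includes UPLOAD_ERR_PARTIAL
    }
    $declared = $post[$sizeKey];
    return is_string($declared)
        && ctype_digit($declared)
        && (int) $declared === (int) $files[$fileKey]['size'];
}

// Usage:
// if (!uploadSizeMatches($_FILES, $_POST, 'photo', 'photo_size')) {
//     http_response_code(400);
//     exit('Upload is incomplete');
// }
```

This only vouches for the one file part it inspects. It says nothing about other fields in the same request.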

nightowl
  • You can not validate content-length this way. First of all, you need to count size of all multipart boundaries. In addition multipart can have preamble/epilogue of any size that is counted in content-length but that you don't receive in PHP. – spatar Dec 04 '13 at 17:29
  • And btw, headers are not part of content. – spatar Dec 04 '13 at 17:49
  • If PHP can't access the size of the multipart boundary preamble or epilogue, then this can't be done as a pure-PHP solution by comparing the Content-Length request header to the received data. As @MeNa and I both suggested, you'll need an additional field or parameter of some kind, to track the size of the transmitted post data. – nightowl Dec 05 '13 at 00:47
0

Maybe you can validate with a check variable instead of the length, e.g.:

// client
$clientVars = array('var1' => 'val1', 'otherVar' => 'some value');
ksort($clientVars);  // sort by key so both sides hash the values in the same order
$validVar = md5(implode('', $clientVars));
// http_build_query() URL-encodes the values (e.g. the space in 'some value')
$values = http_build_query($clientVars + array('validVar' => $validVar));
httpRequest($url, $values);  // httpRequest() stands for whatever HTTP client you use

// server
$validVar = isset($_POST['validVar']) ? $_POST['validVar'] : '';
unset($_POST['validVar']);
ksort($_POST);  // same key order as on the client
if (md5(implode('', $_POST)) === $validVar) {
    // complete POST, do something
} else {
    // incomplete POST, log the error and do something
}
andy.why
0

I was also going to recommend using a hidden value, or hashing as MeNa mentions. (The issue there is that some algorithms are implemented differently across platforms, so your CRC32 in JS might differ from a CRC32 in PHP. But with some testing you should be able to find a compatible one.)

I'm going to suggest symmetric encryption, just because it's an option. (I don't believe it's faster than hashing.) Aside from confidentiality, encryption (in an authenticated mode) also offers integrity, i.e. is the received message the one that was sent?

Although stream ciphers are very fast, block ciphers like AES can be very fast as well, but this depends on your system, the languages you use, etc. (Here too, different implementations mean that not all encryption is created equal.)

If you can't decrypt the message (or it gives a garbled mess), then the message was incomplete.

But seriously, use hashing. Hash the POST on the client, and on the server first check the length of the hash. A given hash algorithm always produces a fixed-length digest, so if the length doesn't match, it's wrong. Then hash the received POST and compare the result with the hash that was sent. If you do it over the full POST, in a specified order (so any reordering is undone), the overhead is minimal.

All this assumes you can't simply inspect the POST to see whether fields are missing, e.g. with isset(), a length > 0 check, !empty()...

puredevotion
0

I think what you are looking for is $HTTP_RAW_POST_DATA, this will give you the real POST length and then you can compare it to $_SERVER['CONTENT_LENGTH'].

  • 1
    Citing from [here](http://www.php.net/manual/en/ini.core.php#ini.always-populate-raw-post-data): `$HTTP_RAW_POST_DATA` is not available with `enctype="multipart/form-data"`. – spatar Dec 09 '13 at 23:48
0

I don't think it's possible to calculate original content size from $_REQUEST superglobal, at least for multipart/form-data requests.

I would add a custom header to your HTTP request containing a hash of all parameter=value pairs, to be checked on the server side. The headers arrive before the body, so a request whose body is cut off still carries the hash header. Be sure to join the parameters in the same order on both sides, otherwise the hash will differ. Also pay attention to the encoding, which must be the same on client and server.
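
One way to pin down "same order, same encoding" is to agree on a canonical string, for example the parameters sorted by name and URL-encoded per RFC 3986. The sketch below assumes that convention and SHA-256. `paramsHash()` and `paramsMatchHeader()` are made-up names, and the header name in the usage comment is hypothetical.

```php
<?php
/**
 * Builds a canonical string from the parameters (sorted by name, then
 * URL-encoded) and returns its SHA-256 hex digest. The client has to
 * build the string the same way, byte for byte.
 */
function paramsHash(array $params): string
{
    ksort($params, SORT_STRING); // sorts top-level names only
    return hash('sha256', http_build_query($params, '', '&', PHP_QUERY_RFC3986));
}

/**
 * Compares the digest of the received parameters with the digest the
 * client sent in a header.
 */
function paramsMatchHeader(array $params, string $headerHash): bool
{
    return hash_equals(paramsHash($params), strtolower(trim($headerHash)));
}

// Usage:
// if (!paramsMatchHeader($_POST, $_SERVER['HTTP_X_PARAMS_HASH'] ?? '')) {
//     http_response_code(400);
//     exit('Parameters do not match the declared hash');
// }
```

Because the names are sorted before hashing, the order in which the fields travelled does not matter. Nested arrays keep their own order, so the client has to send those consistently.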

If you can configure Apache, you could add a vhost with mod_proxy, configured to proxy to another vhost on the same server. This should filter out incomplete requests. Note that you're using 2 sockets per request this way, so keep an eye on resource usage if you go this way.

Ghigo
0

If computing the content length isn't reasonable, you could probably get away with signing the data sent by the client.

Using javascript, serialize the form data to a json string or equivalent in a reasonably sane manner (i.e. sort it as needed) before submitting. Hash this string using one or two reasonably fast algorithms (e.g. crc32, md5, sha1), and add this extra hash data to what is about to get sent as a signature.

On the server, strip this extra hash data from the $_POST request, and then redo the same work in PHP. Compare the hashes accordingly: nothing got lost in translation if the hashes match. (Use two hashes if you want to avoid the minuscule risk of getting false positives.)

I'd wager there's a reasonable means to do something similar for files, e.g. fetching their name and size in JS, and adding that additional information to the data that gets signed.

This is somewhat related to what some PHP frameworks do to avoid tampering with session data, when the latter gets managed and stored in client-side cookies, so you'll probably find some readily available code to do this in the latter context.


Original answer:

Insofar as I'm aware, the difference between sending a GET or a POST request more or less amounts to sending something like:

GET /script.php?var1=foo&var2=bar
headers

vs sending something like:

POST /script.php
headers

var1=foo&var2=bar              <— content length is the length of this chunk

So for each part, you could calculate the length and check that vs the length advertised by the content-length header.

  • $_FILES entries have a handy size field which you can use directly.
  • For $_POST data, rebuild the query string that was sent and compute its length.

Points to be wary about:

  1. You need to know how the data is expected to be sent in some cases, e.g. var[]=foo&var[]=baz vs var[0]=foo&var[1]=baz
  2. You're dealing with the C-string length rather than the multibyte length in the latter case. (Though, I wouldn't be surprised to learn that an odd browser behaves inconsistently here and there.)
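
For the urlencoded case, rebuilding the body and measuring it could look like this sketch (`rebuiltBodyLength()` is a made-up name). It only matches the real Content-Length if the client encodes names and values exactly the way `http_build_query()` does, which is the first caveat above.

```php
<?php
/**
 * Rebuilds an application/x-www-form-urlencoded body from the parsed
 * parameters and returns its length in bytes (strlen() counts bytes,
 * not multibyte characters).
 */
function rebuiltBodyLength(array $post): int
{
    return strlen(http_build_query($post));
}

// Usage:
// $declared = (int) ($_SERVER['CONTENT_LENGTH'] ?? 0);
// if (rebuiltBodyLength($_POST) !== $declared) {
//     // Either the body was truncated or the client encoded it differently,
//     // e.g. var[]=foo where http_build_query() emits var%5B0%5D=foo.
// }
```

A mismatch is therefore only a hint, and this approach does not carry over to multipart/form-data, where boundaries and part headers also count towards Content-Length.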

Denis de Bernardy
  • I am using `multipart/form-data`, not `application/x-www-form-urlencoded`. – spatar Dec 10 '13 at 17:09
  • Ya. To the best of my knowledge, the POST part will be formatted all the same (in between boundary separators), along with headers for each individual part. – Denis de Bernardy Dec 10 '13 at 17:18
  • I already commented for another answer about calculating `Content-Length` in PHP for a received request. You need to remember counting all multipart boundaries as a part of `Content-Length`. As well there might be preamble/epilogue of arbitrary size which you don't receive in PHP. – spatar Dec 10 '13 at 17:27
  • I'd have expected the boundary and the headers within it wouldn't be relevant in the size. But I guess this is subject to interpretation by browser vendors and servers. Aside: whichever way the length is computed, you may still have issues because of http://httpd.apache.org/docs/2.0/misc/known_client_problems.html#page-header – Denis de Bernardy Dec 10 '13 at 17:52
  • the idea of signing request or in simple words calculating hash and sending it alongside was mentioned in other answers at least 2 times already. – spatar Dec 10 '13 at 18:15
  • Hehe. I'm afraid I didn't take the time to actually read the other answers. ;-) Any particular reason for not having already accepted one of them? – Denis de Bernardy Dec 10 '13 at 18:37
  • I was waiting for a better solution but it seems that hash or additional last parameter is the way to go. – spatar Dec 10 '13 at 18:39
  • Additional parameter isn't such a great idea: You've basically no guarantee it's going to get sent in the order you're hoping for if jQuery or some other js library brutalizes the data a bit before it's sent. The hash, in contrast, will hold the same value irrespective of whether it's sent before or after other POST variables. – Denis de Bernardy Dec 10 '13 at 18:41
  • My client is a custom application, I control how POST request is made :) And I mentioned this as well in a comment to the question. – spatar Dec 10 '13 at 18:42
  • Ya, I use to think that way too for a number of things. And then one day I had to interact with a funky library, and shit hit the fan. ;-) – Denis de Bernardy Dec 10 '13 at 18:42
0

Another solution that might be useful: if the connection from the other side is slow, just remove the execution time limit for the script handling the POST.

set_time_limit(0);

And you'll be sure that the whole post data will be sent.

ventsi.slav