4

I'm running a high-traffic SSL website with Apache/mod_wsgi/Python. Very occasionally (around 10 times in 3 months) I've seen extra garbage characters in POST data.

Usually it has been in the form of an extra character at the end:

('access.uid', 'allow\xba')
('checksum', 'b219d6a006ebd95691d0d7b468a94510496c5dd8\xff')

Once though it was in the middle of someone's password. Something like:

('login_password', 'samplepass\xe7word')

I've tried to reconstruct the request with all the same headers, but I haven't been able to duplicate the error. Does anyone have any ideas about what could be causing this, or how I could go about reproducing and fixing it?

(Copied from below, Peter, Mar 15 at 18:09):
I'm using apache-2.2.17_1. I'm using mod_wsgi-3.3_1 on one machine and mod_wsgi-2.8_1 on another. I've seen this error on both.
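
Since the error can't be reproduced on demand, one way to gather evidence is to record more about the affected requests as they arrive. The following is a minimal sketch, not something taken from the site's code: it assumes Python 3, `application/x-www-form-urlencoded` bodies, and hypothetical names (`PostAuditMiddleware`, `suspicious_fields`, the `post_audit` logger). It flags fields that contain a byte of 0x80 or above, whether it arrived raw or percent-encoded, and logs the declared versus actual body length so the two can be compared.

```python
import io
import logging
from urllib.parse import parse_qsl

log = logging.getLogger("post_audit")  # hypothetical logger name


def suspicious_fields(body):
    """Return (name, value) pairs whose decoded text contains a byte >= 0x80."""
    # Latin-1 maps every byte to exactly one character, so decoding never fails
    # and raw or percent-encoded high bytes both show up as characters >= 0x80.
    text = body.decode("latin-1")
    fields = parse_qsl(text, keep_blank_values=True, encoding="latin-1")
    return [(k, v) for k, v in fields if any(ord(c) >= 0x80 for c in k + v)]


class PostAuditMiddleware:
    """Hypothetical WSGI middleware that logs POSTs containing high bytes."""

    def __init__(self, app):
        self.app = app

    def __call__(self, environ, start_response):
        if environ.get("REQUEST_METHOD") == "POST":
            try:
                declared = int(environ.get("CONTENT_LENGTH") or 0)
            except ValueError:
                declared = 0
            body = environ["wsgi.input"].read(declared)
            flagged = suspicious_fields(body)
            if flagged:
                # Only field names are logged; values may hold passwords.
                log.warning(
                    "high bytes in fields %r; declared=%d read=%d UA=%r",
                    [name for name, _ in flagged],
                    declared,
                    len(body),
                    environ.get("HTTP_USER_AGENT"),
                )
            # Put the body back so the wrapped application can read it as usual.
            environ["wsgi.input"] = io.BytesIO(body)
        return self.app(environ, start_response)
```

If the stray byte is already present in the raw body and the declared length accounts for it, that would suggest it was added before the request reached the application, rather than during parsing.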

cwallenpoole
Peter
  • Any pattern with the user agents? – Brian Goldman Mar 15 '11 at 17:51
  • They were all IE 7 or 8 on windows XP or 7. But I'm not sure that indicates anything since it's such a small sample size. – Peter Mar 15 '11 at 18:00
  • Bounty cause I think this is interesting. – Brian Goldman Mar 17 '11 at 18:44
  • Just a thought, but is it possible these are bots scanning for known exploits? – Martin Tournoij Mar 17 '11 at 19:24
  • Perhaps it only happens when the server is under heavy load? In which case, you could try to reproduce it by hammering a test server with lots of fake requests and have the server simply log everything under the sun. Then you could look for these anomalies in the log output... – Cameron Mar 17 '11 at 23:02
  • Oh, and is it always an *extra* character, or is it sometimes a mutated/missing character too? – Cameron Mar 17 '11 at 23:04
  • There is no load-balancing system. It uses DNS round-robin. – Peter Mar 18 '11 at 17:41
  • On all the errors I've seen it's always been an extra char. – Peter Mar 18 '11 at 17:42
  • @Carpetsmoker: It could be, but I doubt it, because everything else about the requests seems legit: tokens check out, no missing vars, etc. – Peter Mar 18 '11 at 17:49
  • @Cameron: I tried what you suggested. Using a script, I hammered away at a test server for over 3 hours. I got script timeout errors, but not the problem described above. – Peter Mar 18 '11 at 21:58
  • @Peter: Oh well, it was worth a shot. Was the server configured the same as your production one (i.e. SSL requests, OS version, etc.)? – Cameron Mar 18 '11 at 23:54
  • I had similar problems with POST data from AJAX requests, though I can't be sure that there were trash characters. I have written up my problem here: http://superuser.com/questions/201923/random-http-400-errors – Xavier Combelle Mar 22 '11 at 20:09
  • Out of curiosity, what are you using for form validation? – asthasr Mar 24 '11 at 16:57
  • Not sure what you mean exactly. The POSTs from logged-in users have a unique token, the only other posts accepted are login attempts that are valid if the credentials are valid. – Peter Mar 24 '11 at 18:32

4 Answers

2

What version of Apache are you using? From memory, somewhere around Apache 2.2.12-2.2.15 there were various SSL fixes. You might want to ensure you are using Apache 2.2.15 or later.

Graham Dumpleton
  • For mod_wsgi I'm using mod_wsgi-3.3_1 on one machine and mod_wsgi-2.8_1 on another. I've seen this error on both. – Peter Mar 15 '11 at 18:19
0

What happens if you print eval("u'%s'"%garbled_text)? Does the output look plausible? (I understand that you may not be able to post sensitive data.)

It looks to me like somewhere it's assuming you're reading ASCII even though you've told it to read utf-8.

Can we see the code that reads this POST data into python, or where it is specified and from what input form?
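
As an alternative to `eval`, the raw bytes can be decoded under a few candidate encodings to see which ones accept them. This is only a sketch, assuming Python 3 and that the garbled value is available as `bytes`. The helper name `describe_bytes` and the list of encodings are my own choices for illustration.

```python
def describe_bytes(raw):
    """Report how a byte string decodes under a few candidate encodings."""
    report = {}
    for encoding in ("ascii", "utf-8", "latin-1", "cp1252"):
        try:
            report[encoding] = raw.decode(encoding)
        except UnicodeDecodeError as exc:
            # exc.start is the offset of the first byte the codec rejected.
            report[encoding] = "undecodable at offset %d" % exc.start
    return report


for sample in (b"allow\xba", b"samplepass\xe7word"):
    print(sample, describe_bytes(sample))
```

For the samples in the question, the stray byte is rejected by both the ASCII and UTF-8 codecs, so these values are not valid UTF-8 that was merely read as ASCII. Single-byte encodings such as Latin-1 accept any byte, so they cannot rule anything in or out on their own.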

theheadofabroom
  • The versions that I've seen this error on are: apache-2.2.17_1; mod_wsgi-3.3_1 on one machine and mod_wsgi-2.8_1 on the other; python27-2.7.1_1 on one machine and python26-2.6.6_1 on the other. – Peter Mar 18 '11 at 17:43
  • I'm using Ian Bicking's webob (v.1.0.3) to read the POST data. The code at the version I'm using is available here: https://bitbucket.org/ianb/webob/src/e3818f47af70/webob/multidict.py https://bitbucket.org/ianb/webob/src/e3818f47af70/webob/request.py The part of my code where this error happens is pretty basic: `req = app.Request(environ)` `for (key, value) in req.POST:` – Peter Mar 18 '11 at 17:43
  • I'm not able to duplicate this myself, so I can only go by the errors I've seen. But take the checksum above, for example. It is generated like so: `hashlib.sha1(str).hexdigest()`. That should always be a 40-character ASCII string, and on the webpage it appears in the hidden field properly, with no extra \xff in it. The same goes for access.uid, which is a hard-coded ASCII string in the form. Of course someone could POST junk data, but everything else about the failing requests seems legit (tokens check out, no missing vars, etc.), so I don't think that is what is happening here. – Peter Mar 18 '11 at 17:46
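
The point about the checksum can be turned into a server-side check: a SHA-1 hex digest is always 40 lowercase hex digits, so anything else in that field is an anomaly that can be located precisely. A minimal sketch, assuming Python 3, with hypothetical helper names `is_sha1_hexdigest` and `stray_characters`:

```python
import hashlib
import re

HEX_SHA1 = re.compile(r"[0-9a-f]{40}")


def is_sha1_hexdigest(value):
    """True only if value is exactly 40 lowercase hexadecimal digits."""
    return HEX_SHA1.fullmatch(value) is not None


def stray_characters(value):
    """Return (index, character) pairs that are not lowercase hex digits."""
    return [(i, c) for i, c in enumerate(value) if c not in "0123456789abcdef"]


good = hashlib.sha1(b"example").hexdigest()
bad = good + "\xff"  # mimics the corrupted checksum from the question
print(is_sha1_hexdigest(good), is_sha1_hexdigest(bad), stray_characters(bad))
```

Logging the index of each stray character over time would show whether the extra byte always lands at the end of a value or can appear anywhere.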
0

Since you said all errors occurred in IE 7 or 8, I'm starting to suspect the error occurs client-side in the browser. I've never heard of anything like this error, and I have no clue what else could cause it server-side except for hardware failure (though that seems odd too, since only one character is added). Perhaps you should suggest that your users upgrade to a decent browser?

orlp
0

This looks very much like chunked HTTP/1.1.

Use an appropriate handler to un-chunk it prior to parsing. See [1], [2].

Another option is to only accept HTTP/1.0 which doesn't have chunking at all, but this may have downsides.

Community
9000
  • It doesn't look like it to me. The format of chunked encoding is the chunk size in ASCII hexadecimal, CRLF, then the chunk. What Peter is getting is only one byte, and it's not necessarily in the ASCII character set. – Brian Goldman Mar 23 '11 at 15:41
  • If it was a chunked request, then Apache/mod_wsgi would have rejected the request. WSGI doesn't support chunked requests. There is a way of having mod_wsgi allow you to step outside of WSGI to handle chunked request content, but it has to be explicitly enabled via a directive. Either way, Apache itself removes the chunking, so it should never be seen by the application. – Graham Dumpleton Mar 23 '11 at 21:17
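
To make the chunked format described in the comments concrete, here is a minimal sketch of a decoder, assuming Python 3 and a complete body with no trailer headers. The function name `decode_chunked` is hypothetical, and this is not how Apache implements it.

```python
def decode_chunked(body):
    """Decode an HTTP/1.1 chunked body (without trailers) into its payload."""
    payload = bytearray()
    pos = 0
    while True:
        line_end = body.index(b"\r\n", pos)
        # The size line is ASCII hex, optionally followed by ";extension".
        size_field = body[pos:line_end].split(b";", 1)[0]
        size = int(size_field.decode("ascii"), 16)
        pos = line_end + 2
        if size == 0:
            break  # a zero-length chunk terminates the body
        payload += body[pos:pos + size]
        pos += size + 2  # skip the chunk data and its trailing CRLF
    return bytes(payload)


wire = b"5\r\nlogin\r\n6\r\n=allow\r\n0\r\n\r\n"
print(decode_chunked(wire))
```

Leftover chunk framing in a value would therefore look like ASCII hex digits and CRLF pairs, not a single byte such as \xba or \xff.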