I have a python wsgi application running under apache2 via mod_wsgi 3.3, on Ubuntu 12.04.4 LTS.

class MyGateway(object):
    # ...
    def __call__(self, environ, start_response):
        environ['gateway.errors'] = []
        return self.gateway(environ, start_response)
    # ...

if __name__ == '__main__':
    # Code to run outside of apache/mod_wsgi
    pass
else:
    g = MyGateway()
    application = g

It has been running fine for more than half a year, with no interaction required and no changes to the code or the configuration. This morning, tech support started getting calls that it was not working. When I looked into it, I found that the expected results were getting extra characters inserted. For example, if incorrect credentials were used, the correct body of the server response would be:

<b>Incorrect username or password.</b>

Instead, I am getting:

1
<
1
b
1
>
1
I
1
n
1
c
1
o
1
r
1
r
1
e
1
c
1
t
1

1
u
1
s
1
e
1
r
1
n
1
a
1
m
1
e
1

1
o
1
r
1

1
p
1
a
1
s
1
s
1
w
1
o
1
r
1
d
1
.
1
<
1
/
1
b
1
>
0

(I would have loved to collapse that about 12 lines down, but the powers that be declined that feature request.)

We tried:

  • Restarting the server
  • Installing updates to the OS and rebooting
  • Restoring from a backup from a few weeks ago
  • Reverting to a full machine backup from a couple of months ago

None of these resolved the issue. I added logging to the python code to see whether the correct values were being returned by the python script (which would also help determine whether an external dependency was broken), and found that right up until the point where the __call__ method returns, I have the correct value:

def __call__(self, environ, start_response):
    environ['gateway.errors'] = []
    response = self.gateway(environ, start_response)
    log.write('response to be returned: <%s>' % (response,), log.DEBUG)
    # The above results in
    # "response to be returned: <<b>Incorrect username or password.</b>>" in the
    # log file for invalid credentials.
    return response

I found this SO question with similar symptoms, and the resolution was to upgrade mod_wsgi. I looked into doing so, but apt-get tells me I am on the latest version, so short of compiling it myself, that will require upgrading to Ubuntu 14.x.

So, my question is threefold:

  1. Does anyone have better ideas as to what the likely cause of this corrupted server response is?
  2. If the restored machine exhibits the same behavior, but there were no issues prior to today, that would suggest some other machine has the problem (using Occam's razor to filter out other possibilities). However, no other network traffic is having problems with garbage data. What should I try to rule out next?
  3. The accepted answer and its comments for this serverfault question imply that I will need to install yet another module to enable logging of the response in apache, and that still might not get me the response body (the not-accepted answer that came 9 months later indicates it could work, but I see no response from anyone saying it does). Does anyone have more information about how to get the response body logged? Another possibility would be to install Wireshark on Ubuntu (I've only used it on Windows thus far) to see if the response is leaving the machine correctly but getting corrupted elsewhere, but is there a more common tool people use on Ubuntu, or a simpler way to check the response internal to the machine?
hlongmore

1 Answer

Two things.

The first is that it looks like you are returning a string from your WSGI application instead of an iterable (e.g. list) of strings. This is resulting in each single character being sent one at a time, which is absolutely dreadful for performance. So don't return a string, but a list containing a single string.
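
To illustrate the first point, here is a minimal sketch of what a WSGI server does with the return value: it iterates over it and writes each item separately. The helper name pieces_written_by_server is hypothetical, and the body is a native str to mirror the Python 2 code in the question (under Python 3 a WSGI body would be bytes).

```python
def pieces_written_by_server(app_result):
    """Mimic a WSGI server: iterate the result, one write per item."""
    return [piece for piece in app_result]


body = "<b>Incorrect username or password.</b>"

# Bare string: iteration yields one character at a time.
as_string = pieces_written_by_server(body)

# One-element list: iteration yields the whole body exactly once.
as_list = pieces_written_by_server([body])

print(len(as_string), len(as_list))
```

The string case produces as many writes as there are characters in the body, while the list case produces a single write.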

The second is that in combination with returning a string with each character being sent one at a time, you have no content length in the response headers. As a result, Apache is using chunked encoding for the response content. The 1's in the output are actually the chunk sizes from the chunked transfer encoding of the response, which suggests that whatever client you are using is not dealing with chunked encoding properly. So ensure you also set a response content length.
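
A rough sketch of the second point, assuming Python 3 style bytes bodies. The chunked helper is a hypothetical, simplified stand-in for the framing Apache applies when there is no content length (it is not mod_wsgi code), and application is an illustrative app, not the gateway from the question.

```python
def chunked(pieces):
    """Simplified HTTP/1.1 chunked framing: hex size, CRLF, data, CRLF."""
    out = b""
    for piece in pieces:
        if piece:  # a zero-length chunk would terminate the body early
            size_line = ("%x" % len(piece)).encode("ascii")
            out += size_line + b"\r\n" + piece + b"\r\n"
    return out + b"0\r\n\r\n"  # terminating zero-size chunk


BODY = b"<b>Incorrect username or password.</b>"

# One write per character puts a "1" size line before every character.
per_char = chunked([BODY[i:i + 1] for i in range(len(BODY))])


def application(environ, start_response):
    """Illustrative fix: explicit Content-Length and a one-item list."""
    headers = [
        ("Content-Type", "text/html"),
        ("Content-Length", str(len(BODY))),
    ]
    start_response("200 OK", headers)
    return [BODY]
```

A client that does not decode the chunked framing would show those size lines as text, which matches the 1 ... 0 pattern in the question. With a content length supplied, the server should have no need to fall back to chunking.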

Graham Dumpleton
  • Thanks for the info regarding using a list versus a string. The server created by wsgiref.simple_server.make_server generated the content-length header for me, so I hadn't previously added code to set it. Now I understand that mod_wsgi doesn't, and have added it. I am still perplexed as to what made this stop working as it was working before. It seems this [serverfault answer](http://serverfault.com/a/59087) could help explain it, but even that doesn't tell me what changed. – hlongmore Aug 05 '14 at 00:21
  • Apache/mod_wsgi doesn't set a content length if none is supplied, even where it technically could be calculated the way the WSGI specification describes, because the WSGI specification is technically broken in saying you can. From memory, the detailed reasons are in http://blog.dscpl.com.au/2009/10/wsgi-issues-with-http-head-requests.html. As to why you only saw it change now, it could also in part be dictated by what HTTP version your HTTP client said it supported. A client which only supported an older HTTP version wouldn't trigger a chunked response. – Graham Dumpleton Aug 05 '14 at 01:05
  • I guess I've been asking the wrong question. Instead of asking "what changed?" I should be asking, "How did this work in the first place?" Thanks for your help! – hlongmore Aug 05 '14 at 20:13