I'm trying to create a web front end for a Python 3 backed application. The application will require bi-directional streaming, which sounded like a good opportunity to look into WebSockets.

My first inclination was to use something that already exists, and the example applications from mod-pywebsocket have proved valuable. Unfortunately its API doesn't appear to lend itself easily to extension, and it is written for Python 2.

Looking around the blogosphere, many people have written their own WebSocket servers for earlier versions of the protocol, but most don't implement the security key hash and so don't work.

After reading RFC 6455 I decided to take a stab at it myself and came up with the following:

#!/usr/bin/env python3

"""
A partial implementation of RFC 6455
http://tools.ietf.org/pdf/rfc6455.pdf
Brian Thorne 2012
"""
  
import socket
import threading
import time
import base64
import hashlib

def calculate_websocket_hash(key):
    magic_websocket_string = b"258EAFA5-E914-47DA-95CA-C5AB0DC85B11"
    result_string = key + magic_websocket_string
    sha1_digest = hashlib.sha1(result_string).digest()
    response_data = base64.encodestring(sha1_digest)
    response_string = response_data.decode('utf8')
    return response_string

def is_bit_set(int_type, offset):
    mask = 1 << offset
    return not 0 == (int_type & mask)

def set_bit(int_type, offset):
    return int_type | (1 << offset)

def bytes_to_int(data):
    # note big-endian is the standard network byte order
    return int.from_bytes(data, byteorder='big')


def pack(data):
    """pack bytes for sending to client"""
    frame_head = bytearray(2)
    
    # set final fragment
    frame_head[0] = set_bit(frame_head[0], 7)
    
    # set opcode 1 = text
    frame_head[0] = set_bit(frame_head[0], 0)
    
    # payload length
    assert len(data) < 126, "haven't implemented that yet"
    frame_head[1] = len(data)
    
    # add data
    frame = frame_head + data.encode('utf-8')
    print(list(hex(b) for b in frame))
    return frame

def receive(s):
    """receive data from client"""
    
    # read the first two bytes
    frame_head = s.recv(2)
    
    # very first bit indicates if this is the final fragment
    print("final fragment: ", is_bit_set(frame_head[0], 7))
    
    # bits 4-7 are the opcode (0x01 -> text)
    print("opcode: ", frame_head[0] & 0x0f)
    
    # mask bit, from client will ALWAYS be 1
    assert is_bit_set(frame_head[1], 7)
    
    # length of payload
    # 7 bits, or 7 bits + 16 bits, or 7 bits + 64 bits
    payload_length = frame_head[1] & 0x7F
    if payload_length == 126:
        raw = s.recv(2)
        payload_length = bytes_to_int(raw)
    elif payload_length == 127:
        raw = s.recv(8)
        payload_length = bytes_to_int(raw)
    print('Payload is {} bytes'.format(payload_length))
    
    """masking key
    All frames sent from the client to the server are masked by a
    32-bit nonce value that is contained within the frame
    """
    masking_key = s.recv(4)
    print("mask: ", masking_key, bytes_to_int(masking_key))
    
    # finally get the payload data:
    masked_data_in = s.recv(payload_length)
    data = bytearray(payload_length)
    
    # The ith byte is the XOR of byte i of the data with
    # masking_key[i % 4]
    for i, b in enumerate(masked_data_in):
        data[i] = b ^ masking_key[i%4]

    return data

def handle(s):
    client_request = s.recv(4096)
    
    # get to the key
    for line in client_request.splitlines():
        if b'Sec-WebSocket-Key:' in line:
            key = line.split(b': ')[1]
            break
    response_string = calculate_websocket_hash(key)
    
    header = '''HTTP/1.1 101 Switching Protocols\r
Upgrade: websocket\r
Connection: Upgrade\r
Sec-WebSocket-Accept: {}\r
\r
'''.format(response_string)
    s.send(header.encode())
    
    # this works
    print(receive(s))
    
    # this doesn't
    s.send(pack('Hello'))
    
    s.close()

s = socket.socket( socket.AF_INET, socket.SOCK_STREAM)
s.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
s.bind(('', 9876))
s.listen(1)

while True:
    t,_ = s.accept()
    threading.Thread(target=handle, args = (t,)).start()

Using this basic test page (which works with mod-pywebsocket):

<!DOCTYPE html>
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
    <title>Web Socket Example</title>
    <meta charset="UTF-8">
</head>
<body>
    <div id="serveroutput"></div>
    <form id="form">
        <input type="text" value="Hello World!" id="msg" />
        <input type="submit" value="Send" onclick="sendMsg()" />
    </form>
<script>
    var form = document.getElementById('form');
    var msg = document.getElementById('msg');
    var output = document.getElementById('serveroutput');
    var s = new WebSocket("ws://"+window.location.hostname+":9876");
    s.onopen = function(e) {
        console.log("opened");
        out('Connected.');
    }
    s.onclose = function(e) {
        console.log("closed");
        out('Connection closed.');
    }
    s.onmessage = function(e) {
        console.log("got: " + e.data);
        out(e.data);
    }
    form.onsubmit = function(e) {
        e.preventDefault();
        msg.value = '';
        window.scrollTop = window.scrollHeight;
    }
    function sendMsg() {
        s.send(msg.value);
    }
    function out(text) {
        var el = document.createElement('p');
        el.innerHTML = text;
        output.appendChild(el);
    }
    msg.focus();
</script>
</body>
</html>

This receives data and unmasks it correctly, but I can't get the transmit path to work.

As a test to write "Hello" to the socket, the program above calculates the bytes to be written to the socket as:

['0x81', '0x5', '0x48', '0x65', '0x6c', '0x6c', '0x6f']

These match the hex values given in section 5.7 of the RFC. Unfortunately the frame never shows up in Chrome's Developer Tools.
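
For comparison, here is a minimal sketch of the same server-to-client framing with the two extended length forms from the RFC filled in. The name pack_text_frame is hypothetical and not part of the program above, and the sketch assumes a single unmasked, unfragmented text frame.

```python
import struct


def pack_text_frame(text):
    """Build one unmasked, final text frame (hypothetical helper)."""
    payload = text.encode("utf-8")
    length = len(payload)  # length of the encoded bytes, not of the str

    head = bytearray([0x81])  # FIN bit set, opcode 0x1 (text)
    if length < 126:
        # short form: the length fits in the 7-bit field
        head.append(length)
    elif length < (1 << 16):
        # 126 signals a 16-bit big-endian length that follows
        head.append(126)
        head += struct.pack("!H", length)
    else:
        # 127 signals a 64-bit big-endian length that follows
        head.append(127)
        head += struct.pack("!Q", length)
    return bytes(head) + payload


frame = pack_text_frame("Hello")
print([hex(b) for b in frame])
```

For "Hello" this takes the short form and produces the same seven bytes as listed above.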

Any idea what I'm missing? Or a currently working Python3 websocket example?

Hardbyte
  • Tornado supports both websockets and Python 3. http://www.tornadoweb.org/documentation/websocket.html – Thomas K Oct 14 '12 at 15:18
  • Thanks Thomas. I'd like to have a standalone implementation first though - this is as much about understanding the protocol as solving a problem for me. Taking a look at the [tornado source code](https://github.com/facebook/tornado/blob/master/tornado/websocket.py) I see one header **Sec-WebSocket-Protocol** being sent from the server to the client, but the [spec](http://tools.ietf.org/html/rfc6455#section-4.2.2) says that is optional. – Hardbyte Oct 14 '12 at 22:11
  • If a client requests a sub-protocol, the server is expected to echo it (always assuming it supports the sub-protocol). Failure to do so would cause a handshake error so this probably isn't related to your message sending problems. – simonc Oct 15 '12 at 08:02
  • I can't see anything wrong in your code. Would it be worth using wireshark to confirm that the data being written out is the same as the data you're logging internally? And that nothing else is being written in between the handshake and start of your message. – simonc Oct 15 '12 at 15:12
  • Yeah the client didn't request a sub-protocol. But as @Phillip noted, I was sending extra whitespace after my handshake reply. – Hardbyte Oct 15 '12 at 20:41
  • If anyone else is interested I ended up making quite a few more subtle changes - code is tracked on bitbucket: https://bitbucket.org/hardbyte/python-socket-examples/src/tip/websocket.py – Hardbyte May 24 '13 at 08:46

1 Answer

When I try talking to your Python code from Safari 6.0.1 on Lion I get

Unexpected LF in Value at ...

in the JavaScript console. I also get an IndexError exception from the Python code.

When I talk to your Python code from Chrome Version 24.0.1290.1 dev on Lion I don't get any JavaScript errors. In your JavaScript the onopen() and onclose() handlers are called, but not onmessage(). The Python code doesn't throw any exceptions and appears to have received the message and sent its response, i.e. exactly the behavior you're seeing.

Since Safari didn't like the trailing LF in your header, I tried removing it, i.e.

header = '''HTTP/1.1 101 Switching Protocols\r
Upgrade: websocket\r
Connection: Upgrade\r
Sec-WebSocket-Accept: {}\r
'''.format(response_string)

When I make this change Chrome is able to see your response message, i.e.

got: Hello

shows up in the JavaScript console.

Safari still doesn't work. Now it raises a different error when I attempt to send a message.

websocket.html:36 INVALID_STATE_ERR: DOM Exception 11: An attempt was made to use an object that is not, or is no longer, usable.

None of the JavaScript WebSocket event handlers ever fire, and I'm still seeing the IndexError exception from Python.

In conclusion: your Python code wasn't working with Chrome because of an extra LF in your header response. Something else is still going on, because the code that works with Chrome doesn't work with Safari.

Update

I've worked out the underlying issue and now have the example working in Safari and Chrome.

base64.encodestring() always appends a trailing \n to its return value. This is the source of the LF that Safari was complaining about.
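
A quick way to see the difference is sketched below. It uses base64.encodebytes(), the name that base64.encodestring() was a deprecated alias for (the alias was removed in Python 3.9), and the input bytes are an arbitrary placeholder rather than a real handshake key.

```python
import base64
import hashlib

# Any 20-byte SHA-1 digest will do; the input here is just a placeholder.
digest = hashlib.sha1(b"example input").digest()

# MIME-style encoder: appends a newline after the encoded output
with_newline = base64.encodebytes(digest)

# Plain encoder: no trailing newline
without_newline = base64.b64encode(digest)

print(with_newline)
print(without_newline)
print(with_newline == without_newline + b"\n")
```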

Call .strip() on the return value of calculate_websocket_hash; with that change, your original header template works correctly in both Safari and Chrome.
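
As an alternative to stripping, here is a minimal sketch that avoids the newline in the first place by using base64.b64encode(). The helper build_handshake_response is a hypothetical name, not something from your code, and the key in the usage lines is the sample value from the RFC's handshake example.

```python
import base64
import hashlib

MAGIC_WEBSOCKET_STRING = b"258EAFA5-E914-47DA-95CA-C5AB0DC85B11"


def calculate_websocket_hash(key):
    """Return the Sec-WebSocket-Accept value for a client key (bytes)."""
    digest = hashlib.sha1(key.strip() + MAGIC_WEBSOCKET_STRING).digest()
    # b64encode, unlike encodestring/encodebytes, adds no trailing newline
    return base64.b64encode(digest).decode("ascii")


def build_handshake_response(key):
    """Hypothetical helper: assemble the 101 response as bytes."""
    lines = [
        "HTTP/1.1 101 Switching Protocols",
        "Upgrade: websocket",
        "Connection: Upgrade",
        "Sec-WebSocket-Accept: {}".format(calculate_websocket_hash(key)),
        "",  # the two empty entries make the join end with
        "",  # exactly one blank line (CRLF CRLF)
    ]
    return "\r\n".join(lines).encode("ascii")


# Sample key from the RFC 6455 handshake example
sample_key = b"dGhlIHNhbXBsZSBub25jZQ=="
print(calculate_websocket_hash(sample_key))
print(build_handshake_response(sample_key))
```

Joining the lines with "\r\n" rather than using a triple-quoted template also makes it harder for a bare LF to slip into the response.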

Phillip Dixon