What is the mask in a WebSocket frame?

Question

I am working on a WebSocket implementation and do not understand the purpose of the mask in a frame.

Could somebody explain what it does and why it is recommended?

  0                   1                   2                   3
  0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
 +-+-+-+-+-------+-+-------------+-------------------------------+
 |F|R|R|R| opcode|M| Payload len |    Extended payload length    |
 |I|S|S|S|  (4)  |A|     (7)     |             (16/64)           |
 |N|V|V|V|       |S|             |   (if payload len==126/127)   |
 | |1|2|3|       |K|             |                               |
 +-+-+-+-+-------+-+-------------+ - - - - - - - - - - - - - - - +
 |     Extended payload length continued, if payload len == 127  |
 + - - - - - - - - - - - - - - - +-------------------------------+
 |                               |Masking-key, if MASK set to 1  |
 +-------------------------------+-------------------------------+
 | Masking-key (continued)       |          Payload Data         |
 +-------------------------------- - - - - - - - - - - - - - - - +
 :                     Payload Data continued ...                :
 + - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - +
 |                     Payload Data continued ...                |
 +---------------------------------------------------------------+
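For readers following along, the layout above can be decoded with a few lines of Python. This is only a rough sketch under the assumption that the whole header is already in memory; `parse_frame_header` and the dictionary keys are illustrative names, not part of any library. The sample bytes are the masked "Hello" text frame given as an example in RFC 6455, Section 5.7.

```python
import struct


def parse_frame_header(data: bytes) -> dict:
    """Decode a WebSocket frame header (hypothetical helper, see RFC 6455 sec. 5.2)."""
    if len(data) < 2:
        raise ValueError("a frame header is at least 2 bytes long")
    b0, b1 = data[0], data[1]
    header = {
        "fin": bool(b0 & 0x80),      # first bit of byte 0
        "opcode": b0 & 0x0F,         # low 4 bits of byte 0
        "masked": bool(b1 & 0x80),   # MASK bit, first bit of byte 1
    }
    length = b1 & 0x7F               # 7-bit payload length
    offset = 2
    if length == 126:                # next 2 bytes hold the real length
        (length,) = struct.unpack_from("!H", data, offset)
        offset += 2
    elif length == 127:              # next 8 bytes hold the real length
        (length,) = struct.unpack_from("!Q", data, offset)
        offset += 8
    header["payload_len"] = length
    if header["masked"]:             # masking key is present only if MASK is 1
        header["masking_key"] = data[offset:offset + 4]
        offset += 4
    else:
        header["masking_key"] = None
    header["payload_offset"] = offset
    return header


# Masked single-frame text message "Hello" from RFC 6455, Section 5.7.
sample = bytes.fromhex("818537fa213d7f9f4d5158")
print(parse_frame_header(sample))
```

The masking key sits directly in front of the payload, which is why the receiver can always undo the mask.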

score 85 · Accepted Answer · edited Oct 07 '21 at 05:46

WebSockets are defined in RFC 6455, which states in Section 5.3:

The unpredictability of the masking key is essential to prevent authors of malicious applications from selecting the bytes that appear on the wire.

In a blog entry about Websockets I found the following explanation:

masking-key (32 bits): if the mask bit is set (and trust me, it is if you write for the server side) you can read four unsigned bytes here which are used to xor the payload with. It's used to ensure that shitty proxies cannot be abused by attackers from the client side.

But the clearest answer I found was in a mailing list archive, where John Tamplin states:

Basically, WebSockets is unique in that you need to protect the network infrastructure, even if you have hostile code running in the client, full hostile control of the server, and the only piece you can trust is the client browser. By having the browser generate a random mask for each frame, the hostile client code cannot choose the byte patterns that appear on the wire and use that to attack vulnerable network infrastructure.

As kmkaplan stated, the attack vector is described in Section 10.3 of the RFC.
Masking is a measure to prevent proxy cache poisoning attacks¹. It introduces randomness: the client XORs the payload with a random masking key, so the bytes on the wire differ from the bytes the application supplied.

By the way: it isn't just recommended. It is obligatory for every frame a client sends to a server (a server, in turn, must not mask the frames it sends).
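The XOR step itself is tiny. Here is a minimal sketch, assuming the payload and the 4-byte key are already available as `bytes`; `apply_mask` is an illustrative name, not a library function. Byte `i` of the payload is XORed with byte `i mod 4` of the key, and since XOR is its own inverse the same function both masks and unmasks. The sample values are the "Hello" example from RFC 6455, Section 5.7.

```python
def apply_mask(payload: bytes, masking_key: bytes) -> bytes:
    """XOR each payload byte with the key byte at index i mod 4 (hypothetical helper)."""
    if len(masking_key) != 4:
        raise ValueError("the masking key must be exactly 4 bytes")
    return bytes(b ^ masking_key[i % 4] for i, b in enumerate(payload))


key = bytes.fromhex("37fa213d")            # masking key from the RFC example
masked = apply_mask(b"Hello", key)
print(masked.hex())                        # 7f9f4d5158
print(apply_mask(masked, key))             # b'Hello' (applying the mask twice restores the data)
```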

1: See Huang, Lin-Shung, et al. "Talking to yourself for fun and profit." Proceedings of W2SP (2011)

answered Jan 05 '13 at 17:30

Enno Gröper


6

Thanks for the great answer. But couldn't an abused proxy still unmask a frame? – bodokaiser Jan 05 '13 at 17:34
4

The explanation itself *is* in RFC 6455 section 10.3. Attacks On Infrastructure (Masking) – kmkaplan Jan 05 '13 at 17:35
@kyogron the vulnerability is *not* proxy unmasking a frame but proxy cache being poisoned by bad data. – kmkaplan Jan 05 '13 at 17:35
1

@Kyo. Yes, but that's not the point. This is not an encryption algorithm, it is just preventing attackers from having arbitrary control over what bytes the proxy sees. A buggy proxy might crash or worse if it sees particular bytes. – David Grayson Jan 05 '13 at 17:38
10

I still have absolutely no idea what this masking technique is trying to prevent. If I knew about a proxy being vulnerable to a certain byte sequence I could just implement my own TCP server and client to exploit that. – pwuertz Dec 03 '13 at 14:58
6

@pwuertz: but you can't cause that to happen on a hapless user's machine by including malicious JavaScript on a page, can you? – Asherah Mar 21 '14 at 03:52
1

@YukiIzumi is right. This prevents a distributed attack. If someone could deface a website so that the only change was some JS code attacking a proxy, every visitor's browser would carry out the attack. Because the browser applies a random mask and changes the bytes sent, only someone running their own custom client software could attack a proxy that way. – Adrián Pérez Oct 02 '14 at 17:01
4

This is kind of ridiculous -- any "real" hacker wouldn't try to execute a cache poisoning attack by writing JavaScript code that ran in a browser. It would be pretty trivial to take a non-browser WebSocket library and modify it to use a fixed masking key, thereby enabling your code to send whatever bytes you want over the wire (just pre-XOR your bytes with the masking key, then the library will XOR them right back to whatever you want on the wire). Although any "real" proxy should realize that the connection uses WebSocket frames, and shouldn't be trying to parse the data as anything else. – Luke Hutchison Jan 01 '15 at 20:09
1

It is also handy for preventing CBC attacks like BEAST. – Yuhong Bao Jul 03 '15 at 00:43
@LukeHutchison The attacker wouldn't be able to run their modified socket code on the user's computer, and it is assumed that the vulnerable proxy is only reachable from the user's computer and not open to the internet. Getting the user to visit an arbitrary webpage in their browser is trivial though. – Bergi Jul 24 '23 at 12:34

Martin Konecny · Answer 2 · 2017-07-10T20:21:02.513

From this article:

Masking of WebSocket traffic from client to server is required because of the unlikely chance that malicious code could cause some broken proxies to do the wrong thing and use this as an attack of some kind. Nobody has proved that this could actually happen, but since the fact that it could happen was reason enough for browser vendors to get twitchy, masking was added to remove the possibility of it being used as an attack.

So assuming attackers were able to compromise both the JavaScript code executed in a browser and the backend server, masking is designed to prevent the sequence of bytes sent between these two endpoints from being crafted in a way that could disrupt any broken proxies between them (broken here means proxies that might attempt to interpret a WebSocket stream as HTTP when they shouldn't).

The browser (and not the JavaScript code in the browser) has the final say on the randomly generated mask used to send the message which is why it's impossible for the attackers to know what the final stream of bytes the proxy might see will be.
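To make that division of labour concrete, here is a rough sketch of what a client library does when the application asks it to send a text message. It is an illustration under assumptions, not how any particular browser is implemented: `build_client_text_frame` is a made-up name, and the key is drawn from Python's `secrets` module as a stand-in for the browser's own source of randomness. The caller only supplies the message and never sees or chooses the key.

```python
import secrets
import struct


def build_client_text_frame(message: str) -> bytes:
    """Build one unfragmented, masked text frame (hypothetical helper)."""
    payload = message.encode("utf-8")
    # The key is picked here, inside the "client", fresh for every frame.
    masking_key = secrets.token_bytes(4)

    header = bytearray([0x81])              # FIN=1, RSV=0, opcode=0x1 (text)
    n = len(payload)
    if n < 126:
        header.append(0x80 | n)             # MASK=1 plus 7-bit length
    elif n < (1 << 16):
        header.append(0x80 | 126)           # MASK=1, 16-bit extended length follows
        header += struct.pack("!H", n)
    else:
        header.append(0x80 | 127)           # MASK=1, 64-bit extended length follows
        header += struct.pack("!Q", n)

    masked = bytes(b ^ masking_key[i % 4] for i, b in enumerate(payload))
    return bytes(header) + masking_key + masked


# The same message sent twice almost certainly produces different wire bytes,
# because each frame gets its own random key.
print(build_client_text_frame("Hello").hex())
print(build_client_text_frame("Hello").hex())
```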

Note that the mask is redundant if your WebSocket stream is encrypted (as it should be). Article from the author of Python's Flask:

Why is there masking at all? Because apparently there is enough broken infrastructure out there that lets the upgrade header go through and then handles the rest of the connection as a second HTTP request which it then stuffs into the cache. I have no words for this. In any case, the defense against that is basically a strong 32bit random number as masking key. Or you know… use TLS and don't use shitty proxies.

score 14 · Answer 3 · answered Nov 12 '20 at 17:36

I struggled to understand the purpose of the WebSocket mask until I encountered the following two resources, which summarize it clearly.

From the book High Performance Browser Networking:

The payload of all client-initiated frames is masked using the value specified in the frame header: this prevents malicious scripts executing on the client from performing a cache poisoning attack against intermediaries that may not understand the WebSocket protocol.

Since the WebSocket protocol is not always understood by intermediaries (e.g. transparent proxies), a malicious script can take advantage of it and create traffic that causes cache poisoning in these intermediaries.

But how?

The article Talking to Yourself for Fun and Profit (http://www.adambarth.com/papers/2011/huang-chen-barth-rescorla-jackson.pdf) further explains how a cache poisoning attack works:

The attacker’s Java applet opens a raw socket connection to attacker.com:80 (as before, the attacker can also use a SWF to mount a similar attack by hosting an appropriate policy file to authorize this request).

The attacker’s Java applet sends a sequence of bytes over the socket crafted with a forged Host header as follows:

    GET /script.js HTTP/1.1
    Host: target.com

The transparent proxy treats the sequence of bytes as an HTTP request and routes the request based on the original destination IP, that is to the attacker’s server.

The attacker’s server replies with malicious script file with an HTTP Expires header far in the future (to instruct the proxy to cache the response for as long as possible).

Because the proxy caches based on the Host header, the proxy stores the malicious script file in its cache as http://target.com/script.js, not as http://attacker.com/script.js.

In the future, whenever any client requests http://target.com/script.js via the proxy, the proxy will serve the cached copy of the malicious script.

The article also further explains how WebSockets come into the picture in a cache-poisoning attack:

Consider an intermediary examining packets exchanged between the browser and the attacker’s server. As above, the client requests WebSockets and the server agrees. At this point, the client can send any traffic it wants on the channel. Unfortunately, the intermediary does not know about WebSockets, so the initial WebSockets handshake just looks like a standard HTTP request/response pair, with the request being terminated, as usual, by an empty line. Thus, the client program can inject new data which looks like an HTTP request and the proxy may treat it as such. So, for instance, he might inject the following sequence of bytes:

    GET /sensitive-document HTTP/1.1
    Host: target.com

When the intermediary examines these bytes, it might conclude that these bytes represent a second HTTP request over the same socket. If the intermediary is a transparent proxy, the intermediary might route the request or cache the response according to the forged Host header.

In the above example, the malicious script took advantage of the WebSocket not being understood by the intermediary and "poisoned" its cache. Next time someone asks for sensitive-document from target.com they will receive the attacker's version of it. Imagine the scale of the attack if that document is for google-analytics.

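To conclude, forcing a mask on the payload makes this poisoning impractical. The script cannot predict the masking key the browser picks for each frame, so it cannot make the bytes on the wire look like an HTTP request that the intermediary would route or cache.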
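A small experiment shows why the key has to be unpredictable. This is only a sketch of the payload bytes an intermediary would see, with the frame header left out; `xor_mask` and `wire_bytes` are illustrative names, and the fixed key in the first case is an assumption chosen purely for the demonstration. If the script knew the key in advance it could pre-XOR its data and still control the wire. With a fresh random key it cannot.

```python
import secrets

# The bytes the malicious script would like the proxy to see.
INJECTED = b"GET /sensitive-document HTTP/1.1\r\nHost: target.com\r\n\r\n"


def xor_mask(data: bytes, key: bytes) -> bytes:
    """XOR data with a 4-byte key, repeating the key as needed."""
    return bytes(b ^ key[i % 4] for i, b in enumerate(data))


def wire_bytes(script_payload: bytes, key: bytes) -> bytes:
    """Masked payload as it would appear on the wire after the frame header."""
    return xor_mask(script_payload, key)


# Case 1 (assumed, for illustration): the key is predictable.
# The script pre-XORs its data, and the mask turns it back into the HTTP request.
known_key = b"\x01\x02\x03\x04"
crafted = xor_mask(INJECTED, known_key)
print(wire_bytes(crafted, known_key) == INJECTED)    # True: attacker controls the wire

# Case 2: the client picks a fresh random key the script never sees.
# The request line survives only in the 1-in-2**32 case of an all-zero key.
random_key = secrets.token_bytes(4)
on_wire = wire_bytes(INJECTED, random_key)
print(on_wire.startswith(b"GET "))
```

The server can still recover the data, because the key travels in the frame header. The point is only that the script cannot decide in advance what an intermediary will read.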
