56
  • I could not find any answers explaining how the QR code scanning used by WhatsApp Web works.
  • How does authentication happen when the phone (any smartphone running WhatsApp) scans the QR code shown in the browser?
  • I don't want to know about the technology stack behind it, e.g. that WhatsApp uses a modified version of XMPP, uses Erlang, or uses web technologies like socket.io and Ajax for the web version to implement this functionality.
  • The question might be broad, but I am eager to know about the implementation behind it.
Andrew T.

3 Answers

111

It works like this:

1- You open the following URL in your browser: https://web.whatsapp.com/

2- The browser loads the page with all its JS and CSS, and also opens a WebSocket (wss://w4.web.whatsapp.com/ws).

2.1- Every 20000 milliseconds you see traffic on the WebSocket refreshing the QR code on your screen. The server sends this to the browser through the WebSocket (called WS from here on).


2.2- On each QR code refresh received on the WS, your browser does a GET request for the new QR code, Base64-encoded.

2.3- Notice that the specific WS open between the server and the browser is associated with the unique QR code! So, knowing the QR code, the server knows which WS is associated with it.
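The token-to-socket association in steps 2.1 to 2.3 can be pictured with a small sketch. This is only an illustration under stated assumptions, not WhatsApp's actual server code: the class `QrRegistry`, the connection id strings, and the token format are all hypothetical, and the 20-second lifetime is taken from the refresh interval observed above.

```python
import secrets
import time

QR_TTL_SECONDS = 20  # assumption: mirrors the 20000 ms refresh seen on the WS


class QrRegistry:
    """Hypothetical map from the currently shown QR token to its connection."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock
        self._by_token = {}  # token -> (connection_id, expires_at)
        self._by_conn = {}   # connection_id -> current token

    def issue(self, connection_id):
        """Create a fresh token for a connection, invalidating its old one."""
        old = self._by_conn.pop(connection_id, None)
        if old is not None:
            self._by_token.pop(old, None)
        token = secrets.token_urlsafe(16)
        self._by_token[token] = (connection_id, self._clock() + QR_TTL_SECONDS)
        self._by_conn[connection_id] = token
        return token

    def lookup(self, token):
        """Return the connection id for a live token, or None."""
        entry = self._by_token.get(token)
        if entry is None:
            return None
        connection_id, expires_at = entry
        if self._clock() >= expires_at:
            return None  # token outlived its refresh window
        return connection_id
```

Calling `issue("ws-1")` again for the same connection models a QR refresh: the old token stops resolving and only the newest one maps back to that socket.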

---- At this stage your browser is ready to do the WhatsApp app's work, but it does not know your ID (the WhatsApp identifier, which is your mobile number), because it can't get your phone number out of thin air.

It also does not ask you to type it, because the server couldn't be sure that the number really belongs to you.

So, to let the servers know that the WS session belongs to a specific phone, you need to use the phone to read the QR code.

3- You grab your phone, which is already authenticated (otherwise you wouldn't have access to the QR scanning section), and scan the QR code.

4- When your phone reads the QR code, it contacts the WhatsApp servers and tells them: my number is XXXX, my auth creds are YYYYY, and the WS associated with this QR code can now receive my data.

5- The server now knows that it can direct traffic to the specific WS that belongs to that QR code, and does so!
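Steps 3 to 5 can be sketched roughly as follows. Everything here is an assumption made for illustration: `FakeSocket` stands in for a real browser WebSocket, `PairingServer` and its method names are hypothetical, and `credentials_ok` abstracts away however the server actually verifies the phone.

```python
import secrets


class FakeSocket:
    """Stand-in for a browser WebSocket; records what the server sends."""

    def __init__(self):
        self.sent = []

    def send(self, message):
        self.sent.append(message)


class PairingServer:
    """Hypothetical server that binds a scanned QR token to a phone account."""

    def __init__(self):
        self._pending = {}  # QR token -> browser socket waiting for a scan
        self._paired = {}   # phone number -> browser socket

    def open_browser_session(self, socket):
        """Browser connected: create a token to be rendered as a QR code."""
        token = secrets.token_urlsafe(16)
        self._pending[token] = socket
        return token

    def claim(self, token, phone_number, credentials_ok):
        """Phone scanned the QR code and vouches for the browser session."""
        if not credentials_ok:
            return False
        socket = self._pending.pop(token, None)  # a token can be claimed once
        if socket is None:
            return False
        self._paired[phone_number] = socket
        socket.send({"type": "paired", "user": phone_number})
        return True

    def deliver(self, phone_number, payload):
        """Route account data to the paired browser socket, if any."""
        socket = self._paired.get(phone_number)
        if socket is not None:
            socket.send(payload)
```

The point of the sketch is that the browser never proves who it is: the authenticated phone names the token, and the server uses the token to find the socket that should start receiving that account's data.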

6- On the browser WS you can see the server sending data about the user, about the conversations you are having, and about which photo thumbnails to fetch.


7- The browser gets this data from the WebSocket and makes the corresponding GET requests to fetch the thumbnails and other resources it needs, like an MP3 for notifications.

7.1- The WS listener in the browser also calls into the JavaScript files received at step 1 to redraw the page DOM with the new interface.

8- The interface is now redrawn to look like the WhatsApp app. You continue to receive data on the WS and send when needed, and the interface is updated as data arrives on the WS.

That is it.

Using Chrome and its Developer Tools, you can see all this happening live. You can also see the WS communication (most of it; for the binary frames you would need another tool) and follow what is happening every step of the way.

Also:

SysHex
  • But your phone also needs to be connected to the internet all the time to use WhatsApp Web. Where is that dependency used? You say you continue to receive data on the WS, but if you disconnect your phone, you don't receive any data on the WS. – nkalra0123 Oct 03 '18 at 07:01
  • 4
    That is correct. I just didn't think that bit was relevant to the question, but it is fairly easy. The WhatsApp app keeps sending a heartbeat to the server, a keep-alive ping of sorts. If you turn off the phone, the keep-alive ping stops, and when it stops, the server stops delivering to the phone and the WS. The WS can stay alive; it just receives no data. Something of this sort. The question was "Mechanism behind QR code scanning of whatsapp webapp", which is what I described, not the complete functioning of the WhatsApp mobile and web apps. – SysHex Oct 04 '18 at 08:47
  • Would you please elaborate on points 5 and 6, about how the web browser can be authenticated after getting information through a WebSocket? Thank you. – Đức Thanh Nguyễn Jun 11 '21 at 14:31
  • @ĐứcThanhNguyễn I don't understand your question. There is nothing to elaborate. The server creates a WS and a QR code. That is a pair: if you know the QR code, you know the associated WS. If a phone takes a picture of a QR code and tells the server "I took a picture of this QR code", then the server knows which WS to associate with that phone, because the QR code is the same. Also, I've provided both code and a complete tutorial explaining this in more detail. – SysHex Jun 23 '21 at 10:31
5

It uses something like the following.

  1. The user opens the WhatsApp web application in a web browser.
  2. The server creates a unique token (a number) and embeds it in the QR code.
  3. The WhatsApp phone application reads the QR code and decodes the token.
  4. The WhatsApp phone application sends information about its current user, along with the newly read token, to the WhatsApp server.
  5. The WhatsApp server matches the token (plus the phone app's user information) with the web browser.
  6. It automatically authenticates the user and opens a new web page with his/her information on it.
Atilla Ozgur
  • For example, thousands of tokens are generated by the server as each user opens web.whatsapp.com, and HTTP is stateless. How does the server initiate contact with, and find, the particular user's browser window associated with a particular token? –  Aug 30 '16 at 15:44
  • @dewnor How does Stack Overflow differentiate between you and me? This is similar: we each have different tokens. – Atilla Ozgur Aug 30 '16 at 17:27
  • 1
    That is true when the user directly interacts with the browser (form loads -> user enters credentials -> presses login button -> gets redirected to dashboard). But here only the QR code gets rendered, and then the interaction happens between the phone and the server. Once the user is authenticated on the server, without any user interaction with the browser, the browser is redirected to the dashboard. –  Aug 31 '16 at 16:04
  • With respect to the above comment, please elaborate on points 5 and 6 of your answer. –  Aug 31 '16 at 16:42
2

There are two ways to implement a QR login like WhatsApp's:

  1. Ajax polling
  2. WebSocket

I've made demos of both in PHP.

Note: The WebSocket approach requires two ports, one for the main app and the other for listening for WebSocket connections. The HTTP server and the WebSocket server can also run on the same port, with a proxy or some other arrangement.

I also found an example in Node.js: QR login Websocket with nodejs
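To give a feel for the Ajax polling variant, here is a minimal in-memory sketch in Python rather than PHP. It is not taken from the demos mentioned above; the function names, the status strings, and the idea of a `/status?token=...` endpoint are assumptions made only for illustration.

```python
import secrets

# token -> user id once a phone has approved it, or None while still pending
_sessions = {}


def create_login_token():
    """Called when the login page is rendered; the token goes into the QR code."""
    token = secrets.token_urlsafe(16)
    _sessions[token] = None
    return token


def approve(token, user_id):
    """Called on behalf of the already-authenticated phone app after scanning."""
    if token not in _sessions or _sessions[token] is not None:
        return False  # unknown token, or someone already approved it
    _sessions[token] = user_id
    return True


def poll(token):
    """What a hypothetical GET /status?token=... handler would return."""
    if token not in _sessions:
        return {"status": "unknown"}
    user_id = _sessions[token]
    if user_id is None:
        return {"status": "pending"}
    return {"status": "approved", "user": user_id}
```

With polling, the browser keeps its own token and asks repeatedly whether it has been approved, so the server never has to reach out to the browser. With a WebSocket, the server pushes the result over the open connection instead.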

Sahil Kashyap