4


I'm using Selenium for web crawling. It works fine most of the time, but some websites can detect it, so I decided to dig deeper into how it works.

After some searching, I found the "Chrome DevTools Protocol" and the "JSON Wire Protocol".

The "JSON Wire Protocol" is used between Selenium and the WebDriver. The language bindings for Python, Java, C# and other languages all implement it, so they can communicate with the WebDriver through one unified protocol. Several articles explain this point.

But I can't find any article about how the WebDriver communicates with the browser. A few articles say ChromeDriver communicates with Chrome through the "Chrome DevTools Protocol", but they don't explain the details, so I'm not sure whether this is correct.

How does the browser receive a command from the browser driver and execute it?

vitaliis
Fire0594

4 Answers

4

WebDriver basically lets you create a driver object for each of 7 known browsers.

WebDriver is an interface, and RemoteWebDriver is a class that implements the WebDriver interface.

All 7 of these classes (ChromeDriver, SafariDriver, EdgeDriver and so on) extend the RemoteWebDriver class.

Below is the communication flow between WebDriver and the browser:

  • for each Selenium command, an HTTP request is created and sent to the browser driver

  • the browser driver uses an HTTP server for getting the HTTP requests

  • the HTTP server determines the steps needed for implementing the Selenium command

  • the implementation steps are executed on the browser

  • the execution status is sent back to the HTTP server

  • the HTTP server sends the status back to the automation script
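
The first step in the flow above, turning a Selenium command into an HTTP request, can be sketched roughly as follows. This is a minimal illustration, not Selenium's actual code: the `COMMANDS` table, the `toHttpRequest` helper and the ids are hypothetical. The endpoints themselves follow the W3C WebDriver specification.

```javascript
// Hypothetical table: command name -> HTTP method and URL path.
// The paths follow the W3C WebDriver specification.
const COMMANDS = {
  navigate:    { method: 'POST',   path: '/session/:sessionId/url' },
  getTitle:    { method: 'GET',    path: '/session/:sessionId/title' },
  findElement: { method: 'POST',   path: '/session/:sessionId/element' },
  click:       { method: 'POST',   path: '/session/:sessionId/element/:elementId/click' },
  quit:        { method: 'DELETE', path: '/session/:sessionId' },
};

// Build the HTTP request a client library would send to the browser driver.
function toHttpRequest(name, ids = {}, body = {}) {
  const command = COMMANDS[name];
  if (!command) throw new Error(`Unknown command: ${name}`);
  // Fill the :placeholders in the path with concrete identifiers.
  const path = command.path.replace(/:(\w+)/g, (_, key) => {
    if (!(key in ids)) throw new Error(`Missing id: ${key}`);
    return encodeURIComponent(ids[key]);
  });
  // Only POST requests carry a JSON body.
  const payload = command.method === 'POST' ? JSON.stringify(body) : null;
  return { method: command.method, path, body: payload };
}

// Example: the request behind something like driver.get(url).
// 'abc123' is a made-up session id.
console.log(toHttpRequest('navigate', { sessionId: 'abc123' }, { url: 'https://example.com/' }));
```

A real client would then send this request to the HTTP server started by the browser driver and wait for the JSON reply.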

Read more about it here


vitaliis
cruisepandey
  • Thank you for the quick answer. About the 4th step in the communication flow, "the implementation steps are executed on the browser": how does it work? Is the Selenium command translated into JS and run in the browser via the "Chrome DevTools Protocol", or via some other protocol? – Fire0594 Jun 02 '21 at 10:14
  • Whenever you write a Selenium command, e.g. `.click()`, an HTTP request is created internally and sent to the browser driver, which hands it to its HTTP server. It is the HTTP server that understands the command and performs the steps (in this case `click()`) on the UI. In the same way, a status code based on the action performed on the UI is returned to the HTTP server, and from there it is sent back to our automation code, where we can see it in the console. The protocol is the JSON Wire Protocol throughout this flow. – cruisepandey Jun 02 '21 at 10:20
  • I have attached a picture from my book; I hope it makes this easier to understand. – cruisepandey Jun 02 '21 at 10:26
  • In the picture there is an HTTP server between the browser driver and the browser, and it sends the "execution of command" to the browser. How does this step work? – Fire0594 Jun 02 '21 at 11:02
  • The HTTP server receives the HTTP request in the form of a URL (internally, even the HTTP request is converted into a URL); it is the JSON Wire Protocol that creates the HTTP request. Once the HTTP server has received the request, it needs to implement it, so it implements it on the browser. If you ask how it implements it, it is somewhat like how a compiler executes code once it receives the object class of a file. – cruisepandey Jun 02 '21 at 12:35
  • Brilliant. This explains why webdriver.Chrome('/Users/.../chromedriver2',options=options) must be repeated in the except branch of a Python try-except statement. That feedback loop modifies the webdriver. – Al Martins Jul 01 '21 at 08:28
2

At a high level, Selenium WebDriver interacts with the browser without translating commands into JavaScript. Basically, our Java or Python code is sent as API GET and POST requests using the JSON Wire Protocol. As explained in the answer above, the browser driver interacts with the real browser through HTTP requests.

Every browser driver uses an HTTP server to receive HTTP requests. Once the URL reaches the browser driver, it passes the request to the real browser over HTTP. The commands in your Selenium script are then executed on the browser.

If the request is a POST request, there will be an action on the browser. If the request is a GET request, the corresponding response will be generated at the browser end. It is then sent over HTTP to the browser driver, and the browser driver sends it over the JSON Wire Protocol back to the UI.

  • Test commands are converted into an HTTP request by the JSON wire protocol
  • Every browser has its own driver, which initializes the server before any test cases are executed.
  • The browser then starts receiving the request through its driver.
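
To make the return leg of this round trip more concrete, here is a rough sketch of how a client might read the driver's reply. It assumes the W3C WebDriver response shape, in which the JSON body has a `value` field and an error reply carries `error` and `message` inside it; the older JSON Wire Protocol used a somewhat different envelope. `parseDriverResponse` is a hypothetical helper, and the sample bodies are made up.

```javascript
// Interpret the HTTP reply that a browser driver sends back to the client.
// Assumes the W3C WebDriver envelope: { "value": ... }.
function parseDriverResponse(statusCode, bodyText) {
  const { value } = JSON.parse(bodyText);
  if (statusCode >= 400) {
    // Error replies look like { value: { error, message, stacktrace } }.
    const err = new Error(value.message);
    err.code = value.error; // e.g. "no such element"
    throw err;
  }
  return value;
}

// A successful reply to a "get title" request (made-up body).
console.log(parseDriverResponse(200, '{"value":"Example Domain"}'));

// A failed "find element" request (made-up body).
try {
  parseDriverResponse(404, JSON.stringify({
    value: { error: 'no such element', message: 'Unable to locate element', stacktrace: '' },
  }));
} catch (err) {
  console.log(err.code, '-', err.message);
}
```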

For more information, please refer to the sites below:

BrowserStack :- https://www.browserstack.com/guide/selenium-webdriver-tutorial

Selenium Dev:- https://www.selenium.dev/documentation/en/webdriver/understanding_the_components/

Edureka:- https://www.edureka.co/blog/selenium-webdriver-architecture/

LearnerLaksh
0

Two popular facilities used by automated web testing frameworks are WebDriver and DevTools. Before going into detail, let us take a look at some of the entities we will be talking about.

  • WebDriver
  • WebDriver protocol
  • DevTools
  • DevTools Protocol
  • Chrome DevTools Protocol

WebDriver is a browser-specific module (a kind of remote agent) that we (or testing libraries such as Selenium) can interact with using the WebDriver protocol, a language-neutral wire protocol. The instructions we give to WebDriver are delivered to its browser through the browser's own proprietary communication mechanism. The WebDriver protocol consists of a set of commands that abstract away common interactions with an application, such as navigating, clicking, or reading the state of an element. Since it is a web standard, it is well supported across all major browser vendors. The WebDriver protocol is organized into commands; each HTTP request with a method and template defined in the specification represents a single command, and therefore each command produces a single HTTP response.
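
The idea that an HTTP method plus a URI template identifies exactly one command can be sketched from the driver's side like this. The command names and templates are taken from the W3C WebDriver specification, but the `ROUTES` table covers only a handful of them, and `matchCommand` is a hypothetical helper rather than any driver's real routing code.

```javascript
// A few (method, URI template, command name) entries from the W3C WebDriver spec.
const ROUTES = [
  ['POST',   '/session',                                          'New Session'],
  ['DELETE', '/session/{session id}',                             'Delete Session'],
  ['POST',   '/session/{session id}/url',                         'Navigate To'],
  ['GET',    '/session/{session id}/title',                       'Get Title'],
  ['POST',   '/session/{session id}/element',                     'Find Element'],
  ['POST',   '/session/{session id}/element/{element id}/click',  'Element Click'],
];

// Find the single command that an incoming request represents.
function matchCommand(method, path) {
  const segments = path.split('/');
  for (const [routeMethod, template, name] of ROUTES) {
    if (routeMethod !== method) continue;
    const parts = template.split('/');
    if (parts.length !== segments.length) continue;
    const params = {};
    const matched = parts.every((part, i) => {
      const variable = /^\{(.+)\}$/.exec(part);
      if (!variable) return part === segments[i]; // literal segment
      if (segments[i] === '') return false;       // a variable must not be empty
      params[variable[1]] = decodeURIComponent(segments[i]);
      return true;
    });
    if (matched) return { name, params };
  }
  return null; // no such command
}

// 's1' and 'e9' are made-up identifiers.
console.log(matchCommand('POST', '/session/s1/element/e9/click'));
```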

DevTools is a proprietary, browser-native interface used to debug the browser from a remote application. Chrome DevTools comes from Chromium, and the protocol used in this case is the Chrome DevTools Protocol. Nowadays more and more browsers are adopting the Chrome DevTools Protocol to interact with their DevTools. The DevTools approach offers automation tools more capabilities than WebDriver does. We (or a testing framework) can use this facility to interact directly with the web browser, without any proxy or agent, and Chrome DevTools uses WebSockets for this channel of communication. The result is a persistent, bidirectional connection between our testing framework module and the web browser.
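
To illustrate what travels over that WebSocket, here is a small sketch of Chrome DevTools Protocol message framing: each command is a JSON object with an `id`, a `method` and `params`; the reply echoes the `id`; events arrive with a `method` but no `id`. `CdpSession` is a hypothetical class, and no real WebSocket is opened here: the `send` callback stands in for the socket, and the incoming messages are made up.

```javascript
// Minimal bookkeeping for CDP-style messages over a bidirectional channel.
class CdpSession {
  constructor(send, onEvent = () => {}) {
    this.send = send;         // writes a string to the (assumed) WebSocket
    this.onEvent = onEvent;   // called for messages that carry no id
    this.nextId = 1;
    this.pending = new Map(); // id -> { resolve, reject }
  }

  // Send a command and return a promise for its result.
  command(method, params = {}) {
    const id = this.nextId++;
    const promise = new Promise((resolve, reject) => {
      this.pending.set(id, { resolve, reject });
    });
    this.send(JSON.stringify({ id, method, params }));
    return promise;
  }

  // Feed one incoming message from the browser into the session.
  receive(text) {
    const message = JSON.parse(text);
    if (message.id === undefined) {
      this.onEvent(message.method, message.params); // an event
      return;
    }
    const waiter = this.pending.get(message.id);
    if (!waiter) return; // reply to an unknown command
    this.pending.delete(message.id);
    if (message.error) waiter.reject(new Error(message.error.message));
    else waiter.resolve(message.result);
  }
}

// Simulated exchange: print what would be sent, then feed in made-up replies.
const session = new CdpSession(
  (text) => console.log('to browser:', text),
  (method) => console.log('event:', method)
);
session.command('Page.navigate', { url: 'https://example.com/' })
  .then((result) => console.log('result:', result));
session.receive(JSON.stringify({ id: 1, result: { frameId: 'FRAME-1' } }));
session.receive(JSON.stringify({ method: 'Page.loadEventFired', params: { timestamp: 1 } }));
```

Matching replies to commands by `id` is what lets many commands and events share one persistent connection, in contrast to the one-request, one-response pattern of the WebDriver protocol.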

Let us take a look at the output produced by the following Selenium Node.js source code. We can see that it uses ws://127.0.0.1:51315, a WebSocket endpoint on port 51315 of localhost (opened by DevTools).

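DevTools listening on ws://127.0.0.1:51315/devtools/browser/1c6fd882-e20d-44e9-9a86-bebcf93a514e

const webdriver = require('selenium-webdriver');

(async () => {
  // Build a Chrome session (this starts ChromeDriver and Chrome).
  const driver = await new webdriver.Builder()
      .withCapabilities(webdriver.Capabilities.chrome())
      .build();
  try {
    await driver.get('https://www.amazon.com/');
  } finally {
    // Always end the session, even if navigation fails.
    await driver.quit();
  }
})();
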
Satyan
0

How a browser driver communicates with its corresponding browser is browser-specific. For example, for earlier versions of Chrome, an automation extension was used to facilitate communication between ChromeDriver and the Chrome browser; later, the Chrome DevTools API came to be used for this purpose.
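
As a rough illustration of the DevTools-based approach, the sketch below maps a few WebDriver commands onto Chrome DevTools Protocol messages. It is not ChromeDriver's actual source, and which CDP calls a real driver issues for each command is an assumption here; a real driver also does much more (waiting for navigation, scrolling elements into view, locating click coordinates). `translateToCdp` and its argument names are hypothetical, while the CDP methods used (`Page.navigate`, `Runtime.evaluate`, `Input.dispatchMouseEvent`) do exist in the protocol.

```javascript
// Hypothetical translation of a WebDriver command into CDP messages.
function translateToCdp(commandName, args = {}) {
  switch (commandName) {
    case 'Navigate To':
      return [{ method: 'Page.navigate', params: { url: args.url } }];
    case 'Get Title':
      return [{
        method: 'Runtime.evaluate',
        params: { expression: 'document.title', returnByValue: true },
      }];
    case 'Element Click': {
      // Assumes the element's centre (x, y) has already been computed.
      const base = { x: args.x, y: args.y, button: 'left', clickCount: 1 };
      return [
        { method: 'Input.dispatchMouseEvent', params: { type: 'mousePressed', ...base } },
        { method: 'Input.dispatchMouseEvent', params: { type: 'mouseReleased', ...base } },
      ];
    }
    default:
      throw new Error(`No CDP translation sketched for: ${commandName}`);
  }
}

console.log(translateToCdp('Navigate To', { url: 'https://example.com/' }));
console.log(translateToCdp('Element Click', { x: 120, y: 45 }));
```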

Here are some links for references:

  • How does chrome driver interact with Chrome browser?
  • https://www.pawangaria.com/post/automation/how-chromedriver-works-in-background/
  • https://sahajamit.medium.com/selenium-chrome-dev-tools-makes-a-perfect-browser-automation-recipe-c35c7f6a2360

Sun Rui