I read in an old thread that a dockerized Selenium Grid is a resource-hungry process.

I am trying to run 250 to 300 Selenium tests in parallel, and after some research I found out I have three options:

1. Multi-threading
2. Multi-processing
3. Running the Selenium script in a Docker container

But then I read that multi-threading does not truly run I/O in parallel?

So I moved my focus to the dockerized Selenium script.

So how many resources will a simple dockerized Selenium script consume? The Selenium part of the script is really simple: it receives 3 to 5 values, enters them into fields on a web page, and clicks a button.
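
Roughly, the Selenium part looks like this (a simplified sketch; the URL and element IDs below are just placeholders):

```python
from selenium import webdriver
from selenium.webdriver.common.by import By

def fill_and_submit(values):
    """Open the page, type the 3-5 values into its fields, and click the button."""
    driver = webdriver.Chrome()
    try:
        driver.get("https://example.com/form")  # placeholder URL
        for i, value in enumerate(values):
            # "field-0", "field-1", ... are placeholder element IDs
            driver.find_element(By.ID, f"field-{i}").send_keys(value)
        driver.find_element(By.ID, "submit").click()
    finally:
        driver.quit()

fill_and_submit(["value1", "value2", "value3"])
```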

Is 24 GB of RAM with 4 CPU cores enough for the above-mentioned procedure?

1 Answer

If you're going to run everything on one host, you won't get any benefit from dockerizing.

The most resource-consuming part here is the web browser. Try to run 250-300 browser instances at the same time and you will get your answer.

Basically, Docker does not address the parallelization issue; it addresses isolation and simplifies distribution and deployment. The most resource-effective option from your list is multi-threading; however, this requires keeping your test code thread-safe.

I would suggest running a test. How much your browsers will consume depends on how heavy your UI is: if it loads a lot of data, it will take more RAM; if it runs a lot of JavaScript, it will take more CPU. So start with 20 parallel sessions and watch your resources, then increase if everything looks fine.
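
A rough sketch of that multi-threaded setup in Python (assuming the Python bindings and Chrome; the URL, test data, and pool size are placeholders, not part of the original answer):

```python
from concurrent.futures import ThreadPoolExecutor
from selenium import webdriver

def run_one_test(values):
    # One browser per thread: never share a WebDriver instance across threads.
    driver = webdriver.Chrome()
    try:
        driver.get("https://example.com/form")  # placeholder URL
        # ... enter the values and click the button, as in the question ...
        return driver.title
    finally:
        driver.quit()

test_data = [["a", "b", "c"]] * 300  # placeholder for the 250-300 test inputs

# Start with a small pool (e.g. 20 parallel sessions) and watch RAM/CPU
# before raising max_workers any further.
with ThreadPoolExecutor(max_workers=20) as pool:
    results = list(pool.map(run_one_test, test_data))
```

Keeping each WebDriver local to its own worker is what keeps the test code thread-safe; the browsers themselves remain the main RAM/CPU cost.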

Alexey R.
  • So dockerized Selenium won't benefit me in this scenario; I kind of had my hopes up. I am a bit confused about multi-threading, though. I have been surfing the web like a zombie and I only find mixed answers: some sources say multi-threading can work in parallel and other sources say it won't... – Sam Feb 03 '21 at 18:50
  • Processes in Docker will eventually share the same hardware, just as they would without Docker. So if you have many computers you can deploy your cluster over them (using Swarm or Kubernetes) and thus get a benefit. – Alexey R. Feb 03 '21 at 18:53
  • I ran the Selenium part without multi-threading or anything else; it launches the browser, enters the data into the required fields, clicks, etc. I ran 10 instances and all 10 consumed only 532 MB of RAM, while CPU usage for all 10 was less than 17%. That's something I really liked since it's resource-friendly, but the main issue is: if 200 Selenium scripts need to enter data on the web page, will multi-threading help Selenium perform the task on all 200 browser instances simultaneously? Or will it complete one task first and then move on to the next? – Sam Feb 03 '21 at 18:58
  • Multiple computers aren't a solution for me, regretfully. – Sam Feb 03 '21 at 18:59
  • The idea of having Grid is that you have a parameterized test on one side and multiple "nodes" where the test can actually be executed. So you run test 1, test 2 and test 3 in parallel. All three tests create 3 parallel connections through `RemoteWebDriver`, so each operates with its own WebDriver instance. With that `RemoteWebDriver` you connect to the grid hub, which dispatches the commands to an appropriate node. At the node, a dedicated browser starts for each test. So having 300 parallel tests would require 300 parallel web browsers, where each loads its own context and resources ==> – Alexey R. Feb 03 '21 at 19:11
  • ==> each takes its share of network bandwidth, RAM, video memory, CPU, etc. In your case you probably do not even need Grid, because Grid is effective when you have several computers. By default a grid node is limited to 5 parallel sessions, all because a running browser context is very costly. – Alexey R. Feb 03 '21 at 19:14
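
For reference, the `RemoteWebDriver` side described in these comments might look roughly like this in Python (the hub address and target URL are placeholders; a sketch, not the poster's own code):

```python
from selenium import webdriver

options = webdriver.ChromeOptions()
# The hub receives the commands and dispatches them to a free node,
# which starts a dedicated browser for this session.
driver = webdriver.Remote(
    command_executor="http://grid-hub:4444/wd/hub",  # placeholder hub address
    options=options,
)
try:
    driver.get("https://example.com/form")  # placeholder URL
    # ... same test steps as before ...
finally:
    driver.quit()
```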