
I am currently using Selenium to open a list of URLs and retrieve the page source of each. However, Selenium takes far too long per URL, and I plan to use this script for at least a couple of hundred URLs. Can anyone suggest a faster way to get the page source for a given URL (using PHP, maybe)?

Please include the code of your suggestion. Thanks in advance.

browser.get(url)
body = browser.page_source
petezurich
Mohamad Moustafa

1 Answer


I'm a noob.

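But I think requests would be the fastest, followed by a headless browser (Selenium, but without opening a GUI), and finally regular Selenium. I'm basing this on the resources each method is likely to use. Keep in mind that requests only returns the HTML the server sends and does not run JavaScript, so it only helps if the content you need is already in that initial response.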
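Here is a rough sketch of the requests approach, in case it helps. It assumes the pages don't need JavaScript to render and that the `requests` package is installed. The function names `fetch_page_source` and `fetch_all`, the 10-second timeout and the 8 worker threads are my own choices for illustration, not anything from the question.

```python
from concurrent.futures import ThreadPoolExecutor

import requests


def fetch_page_source(url, timeout=10):
    """Return the raw HTML the server sends for `url` (no JavaScript is run)."""
    response = requests.get(url, timeout=timeout)
    response.raise_for_status()  # raise on 4xx/5xx instead of returning an error page
    return response.text


def fetch_all(urls, max_workers=8):
    """Fetch several URLs concurrently; maps each URL to its HTML, or None on failure."""
    def safe_fetch(url):
        try:
            return fetch_page_source(url)
        except requests.RequestException:
            return None  # network error, timeout, or HTTP error status

    # Threads suit this job because the time is spent waiting on the network.
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return dict(zip(urls, pool.map(safe_fetch, urls)))
```

You would call it as `pages = fetch_all(list_of_urls)` and then read `pages[url]` in place of `browser.page_source`. Fetching in a thread pool is optional, but with a few hundred URLs it should cut the total wall-clock time a lot compared to fetching them one by one.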

Unfortunately, I can't find any articles that time the difference between these methods, but here's an article that covers running Selenium with headless Chrome: https://intoli.com/blog/running-selenium-with-headless-chrome/

Lafftar