
I'm developing a chatbot for a school project. It will use a web service in the backend, which I intend to deploy to a third-party cloud host such as Heroku.

The web service will periodically scrape web pages in real time. I was developing with BeautifulSoup until I discovered dynamically loaded content in the pages I need to scrape, so I have to switch to Selenium.

The problem is that Selenium requires a browser, but the cloud server doesn't have a GUI and probably doesn't allow applications to be installed either.

One solution I thought of is to bundle a portable build of Chromium (the open-source browser that Chrome is based on), which can run from a folder without being installed, and to start it in headless mode, which doesn't need a GUI.
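As a rough sketch of that idea, assuming Selenium 4 for Python: the helper names `headless_chrome_args` and `make_driver` are hypothetical, and the binary and driver paths are placeholders that depend on where the host puts Chromium and chromedriver. The `--no-sandbox` and `--disable-dev-shm-usage` flags are commonly needed in containers, but whether a particular host requires them is an assumption here.

```python
def headless_chrome_args(window_size=(1920, 1080)):
    """Return Chrome/Chromium command-line flags for running without a GUI."""
    width, height = window_size
    return [
        "--headless",               # run without opening a window
        "--no-sandbox",             # often required inside containers (assumption)
        "--disable-dev-shm-usage",  # avoid relying on a small /dev/shm
        f"--window-size={width},{height}",
    ]


def make_driver(chromium_binary=None, chromedriver_path=None):
    """Create a headless Chrome WebDriver (hypothetical helper, Selenium 4)."""
    # Imported here so the flag helper above works without Selenium installed.
    from selenium import webdriver
    from selenium.webdriver.chrome.options import Options
    from selenium.webdriver.chrome.service import Service

    options = Options()
    for arg in headless_chrome_args():
        options.add_argument(arg)
    if chromium_binary:
        # Point Selenium at a bundled Chromium instead of a system Chrome.
        options.binary_location = chromium_binary

    if chromedriver_path:
        service = Service(executable_path=chromedriver_path)
    else:
        service = Service()
    return webdriver.Chrome(service=service, options=options)
```

A caller would do something like `driver = make_driver("/path/to/chromium", "/path/to/chromedriver")`, fetch the page with `driver.get(url)`, hand `driver.page_source` to BeautifulSoup, and finish with `driver.quit()`.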

I'm still a long way from figuring out how to deploy to a cloud host, let alone test my idea, so I thought I'd seek professional input in advance. Will hosting providers permit my web service to run in this manner?

thegreatjedi
  • You first need to tell us what kind of service you have: a virtual machine, or just a lambda/function service? – Sraw May 17 '19 at 05:34
  • Just a folder of some compiled Python scripts, possibly with Chromium and chromedriver inside. That'll be deployed onto the third party cloud server host where it'll run as a service, as far as I can understand how it's supposed to work. – thegreatjedi May 17 '19 at 05:38
  • Well, I can confirm that Selenium can open a browser in headless mode; you just need to pass some startup parameters. Unfortunately, as far as I know, a headless browser sometimes lacks full functionality. The best choice could be to use xvfb to create a virtual GUI environment. – Sraw May 17 '19 at 05:57
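
The xvfb suggestion in the last comment could look roughly like the sketch below. It assumes the host has the `xvfb-run` wrapper installed (not guaranteed on every platform), and the function names and the `scraper.py` script name are hypothetical.

```python
import shutil
import subprocess
import sys


def xvfb_command(script, width=1920, height=1080, depth=24):
    """Build an xvfb-run command that runs `script` on a virtual display."""
    screen = f"-screen 0 {width}x{height}x{depth}"
    # -a picks a free display number; -s passes arguments to the Xvfb server.
    return ["xvfb-run", "-a", "-s", screen, sys.executable, script]


def run_under_xvfb(script):
    """Run `script` under Xvfb, failing early if xvfb-run is unavailable."""
    if shutil.which("xvfb-run") is None:
        raise RuntimeError("xvfb-run not found; is Xvfb installed on this host?")
    return subprocess.run(xvfb_command(script), check=True)
```

With this approach the browser is started in its normal, non-headless mode inside the script (for example `run_under_xvfb("scraper.py")`), and Xvfb supplies the display it draws to.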

0 Answers