A very small part of my code base needs to automatically control a browser. I've tried using Requests and BeautifulSoup - which I've used successfully on other projects - but I'm missing something and getting junk back from the server.

I'm open to using Selenium for what I need (or some other solution). I've managed to get it working but I've found it requires manual download of a driver (I think). For other people that use my library I'd like something that can be installed automatically or some default option that doesn't require installation.

My question, then: is there an option for remotely controlling a browser on Windows that just works after a pip call or two? I'd prefer a Selenium solution, but it appears that every browser Selenium supports now requires an additional manual driver download. Is that correct, or am I missing something?

EDIT: I added automating driver installation as an issue to the Selenium repo: https://github.com/SeleniumHQ/selenium/issues/7922

Jimbo
  • Have you looked at [tag:webdriver-manager]? – undetected Selenium Jan 05 '20 at 21:03
  • I had not seen that, thanks. For what I need - just 3 pages automated - some default browser would be preferable, but perhaps that option just doesn't exist. I could also see updates directly to Selenium where it basically does what webdriver-manager does - assuming that functionality doesn't already exist in Selenium. – Jimbo Jan 05 '20 at 21:14
  • Did you check [this](https://stackoverflow.com/questions/49824109/how-to-update-chrome-driver-for-windows-10)? See the PowerShell option in that question. – supputuri Jan 06 '20 at 03:51
  • @supputuri Thanks for the input. The goal is to have a library that just works via normal python mechanisms (pip, conda, etc.) – Jimbo Jan 07 '20 at 03:09
  • I am sure we can install libraries with pip, but in this case you have to download a zip and then extract the .exe from it into the desired location. I can post the logic I developed for this in Ruby, which might give you an idea. – supputuri Jan 07 '20 at 03:34
  • Added the Python download solution. Please check it and let me know your thoughts. – supputuri Jan 07 '20 at 05:23
  • I still think the ideal solution is to just have this functionality be a part of Selenium .... I added an issue that is obviously low priority but maybe one day ... https://github.com/SeleniumHQ/selenium/issues/7922 – Jimbo Mar 24 '20 at 23:42

1 Answer

PYTHON

Edit1:

You can use webdriver_manager to handle this scenario; it also takes care of the `os.chmod` step. Install it with pip:

pip install webdriver-manager

And below is the sample script for chrome.

from selenium import webdriver
from webdriver_manager.chrome import ChromeDriverManager

driver = webdriver.Chrome(ChromeDriverManager().install())
driver.get("https://www.google.com")
driver.quit()
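
For anyone on a newer Selenium, here is a minimal sketch of the same idea. It assumes Selenium 4, where (as far as I know) the driver path is passed through a `Service` object instead of positionally, and where releases from 4.6 onward bundle Selenium Manager, which resolves a driver on its own. The helper name `chrome_session` and its `use_webdriver_manager` flag are made up for illustration. The imports are deferred so a library can ship this helper without forcing Selenium on users who never touch the browser code.

```python
from contextlib import contextmanager


@contextmanager
def chrome_session(use_webdriver_manager=True):
    """Yield a Chrome WebDriver and always quit it afterwards.

    chrome_session is a hypothetical helper, not part of Selenium.
    """
    # Deferred imports: nothing is required until a browser is requested.
    from selenium import webdriver
    from selenium.webdriver.chrome.service import Service

    if use_webdriver_manager:
        from webdriver_manager.chrome import ChromeDriverManager

        # webdriver_manager downloads (or reuses) a driver and returns its path.
        service = Service(ChromeDriverManager().install())
        driver = webdriver.Chrome(service=service)
    else:
        # Assumption: Selenium 4.6+ is installed, so the bundled
        # Selenium Manager locates or downloads the driver itself.
        driver = webdriver.Chrome()

    try:
        yield driver
    finally:
        # quit() must be called with parentheses, otherwise the browser stays open.
        driver.quit()
```

Usage would look like `with chrome_session() as driver: driver.get("https://www.google.com")`; the `finally` clause closes the browser even if the page automation raises.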

Old Answer:

Using webdriver_manager ensures that chromedriver is always the latest stable version, so you don't have to do any manual steps.

The other advantage of webdriver_manager is that you can download any driver on the fly. Below is a simple example for Firefox (GeckoDriver).

from selenium import webdriver
from webdriver_manager.firefox import GeckoDriverManager

driver = webdriver.Firefox(executable_path=GeckoDriverManager().install())
driver.get("https://www.google.com")
driver.quit()

Here is some quick-and-dirty code to download the latest chromedriver dynamically on Windows.

import requests
import wget
import zipfile
import os

# get the latest chrome driver version number
url = 'https://chromedriver.storage.googleapis.com/LATEST_RELEASE'
response = requests.get(url)
version_number = response.text.strip()  # drop any trailing newline before building the URL

# build the download url
download_url = "https://chromedriver.storage.googleapis.com/" + version_number + "/chromedriver_win32.zip"

# download the zip file using the url built above
latest_driver_zip = wget.download(download_url,'chromedriver.zip')

# extract the zip file
with zipfile.ZipFile(latest_driver_zip, 'r') as zip_ref:
    zip_ref.extractall() # you can specify the destination folder path here
# delete the zip file downloaded above
os.remove(latest_driver_zip)
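
The script above is Windows-only and always takes the newest driver, which is where the comments below ran into trouble (a Mac, Chrome 79 against a Chrome 80 driver, and a missing `os.chmod`). A hedged sketch of a variant that covers those three points follows. It assumes the same legacy `chromedriver.storage.googleapis.com` layout the answer uses, including the per-major `LATEST_RELEASE_<major>` files and the archive names in `ZIP_NAMES`; it swaps `wget` and `requests` for the standard library's `urllib`. All function names are hypothetical.

```python
import os
import platform
import stat
import urllib.request
import zipfile

BASE_URL = "https://chromedriver.storage.googleapis.com"

# Assumed archive names in the legacy bucket, keyed by platform.system().
ZIP_NAMES = {
    "Windows": "chromedriver_win32.zip",
    "Darwin": "chromedriver_mac64.zip",
    "Linux": "chromedriver_linux64.zip",
}


def latest_release_url(chrome_major=None):
    """URL of the text file holding the newest driver version (optionally per Chrome major)."""
    if chrome_major is None:
        return f"{BASE_URL}/LATEST_RELEASE"
    return f"{BASE_URL}/LATEST_RELEASE_{chrome_major}"


def download_url(version, system=None):
    """URL of the driver archive for a version on the given (or current) platform."""
    system = system or platform.system()
    return f"{BASE_URL}/{version.strip()}/{ZIP_NAMES[system]}"


def make_executable(path):
    """Add execute permission bits; on Windows this has no practical effect."""
    mode = os.stat(path).st_mode
    os.chmod(path, mode | stat.S_IXUSR | stat.S_IXGRP | stat.S_IXOTH)


def fetch_driver(chrome_major=None, dest="."):
    """Download, unpack, and chmod the driver; returns the extracted driver path."""
    with urllib.request.urlopen(latest_release_url(chrome_major)) as resp:
        version = resp.read().decode("utf-8").strip()
    # With no filename given, urlretrieve saves to a temporary file.
    zip_path, _ = urllib.request.urlretrieve(download_url(version))
    try:
        with zipfile.ZipFile(zip_path) as zf:
            names = zf.namelist()
            zf.extractall(dest)
    finally:
        os.remove(zip_path)
    # Pick the driver binary rather than any licence file in the archive.
    binary = next(n for n in names if os.path.basename(n).startswith("chromedriver"))
    driver_path = os.path.join(dest, binary)
    make_executable(driver_path)
    return driver_path
```

Calling `fetch_driver(chrome_major=79)` would ask for the newest driver built for Chrome 79 rather than the newest driver overall, which is the mismatch described in the comments.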

RUBY

Here is the same solution implemented in Ruby for Windows.

require 'net/http'
require 'open-uri'
require 'zip' # provided by the rubyzip gem
require 'fileutils'

# Method to extract the contents of the zip file to the destination path
def extract_zip(file, destination)
  # create the destination folder if it doesn't exist
  FileUtils.mkdir_p(destination)
  Zip::File.open(file) do |zip_file|
    zip_file.each do |f|
      fpath = File.join(destination, f.name)
      zip_file.extract(f, fpath) unless File.exist?(fpath)
    end
  end
end

# url where you can get the latest chrome driver information
url = 'https://chromedriver.storage.googleapis.com/LATEST_RELEASE'

# get the latest driver version
parsed_url = URI.parse(url)
http = Net::HTTP.new(parsed_url.host, parsed_url.port)
http.use_ssl = true
http.verify_mode = OpenSSL::SSL::VERIFY_PEER
request = Net::HTTP::Get.new(url)
request["Accept"] = 'application/json'
response = http.request(request)

# build the download url based on the above step
download_url = "https://chromedriver.storage.googleapis.com/#{response.body}/chromedriver_win32.zip"
puts download_url

# build the temporary location where you want to store the downloaded zip
download_path = File.join(ENV['TEMP'],response.body.gsub('.',"_")) + '.zip'
puts download_path

# download the zip file
File.open(download_path, "wb") do |file|
  # Kernel#open no longer handles URLs on Ruby 3.0+, so call URI.open explicitly
  file.write URI.open(download_url).read
end

# Extract the chromedriver.exe from the zip in specific location
extract_zip(download_path,"my_destination_folder_path") # don't specify the filename.

# delete the zip file
FileUtils.rm_rf(download_path)

I run this .rb file whenever a run fails with the error that the driver does not support the installed Chrome version, then resume execution once the new chromedriver has been downloaded. That way, my scripts never fail because of Chrome version changes.
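
The recover-and-retry flow described above can be sketched in Python as well. This is an illustration only: `start_with_refresh` and its parameters are hypothetical names. With Selenium, I would expect `mismatch_errors` to be `selenium.common.exceptions.SessionNotCreatedException`, which is what a Chrome/driver version mismatch usually raises, but that pairing is an assumption to verify against your setup.

```python
def start_with_refresh(start_driver, refresh_driver, mismatch_errors, retries=1):
    """Try to start a driver; on a mismatch error, refresh the driver binary and retry.

    start_driver    -- zero-argument callable that returns a running driver
    refresh_driver  -- zero-argument callable that downloads a fresh driver binary
    mismatch_errors -- tuple of exception types that signal a version mismatch
    retries         -- how many refresh-and-retry rounds to allow
    """
    for attempt in range(retries + 1):
        try:
            return start_driver()
        except mismatch_errors:
            if attempt == retries:
                # Out of retries: surface the original error to the caller.
                raise
            refresh_driver()
```

Any other exception type passes straight through, so unrelated failures are not hidden behind a driver re-download.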

supputuri
  • Thanks, I still need to look into this and try it but I'll come back and comment/accept when I have a chance. – Jimbo Jan 09 '20 at 20:53
  • Thanks for the suggested answer. Of course the first time I went to test it I was on a mac! Then apparently I am using Chrome 79 and the latest release is Chrome 80 so I couldn't run it and I had to manually specify the version to use. I also needed to run `os.chmod` to give the file the necessary permissions to execute and add `wget` at which point simply downloading the manager is probably the way to go. – Jimbo Feb 07 '20 at 03:05
  • @Jimbo Please check the updated answer with `webdriver_manager` usage. Please let me know if you have any questions. – supputuri Mar 24 '20 at 03:29
  • Yeah, the new answer is probably more what people are looking for. The old answer required a bit more work than I was looking for, and I think it ran into problems with versioning (local Chrome versus the downloaded driver). – Jimbo Mar 24 '20 at 23:40
  • Glad that we finally nailed it out. – supputuri Mar 24 '20 at 23:41