
The website is https://www.punters.com.au/stats/ and the element I want to trigger is the "Download CSV" link.

import requests
from bs4 import BeautifulSoup
base_url = 'https://www.punters.com.au'
headers = {'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/58.0.3029.110 Safari/537.3','referer': 'https://www.google.com/'}

# Make a request to the URL and parse the HTML content
response = requests.get(base_url + '/stats/', headers=headers)
soup = BeautifulSoup(response.content, 'html.parser')
# <a class="stats-csv" href="javascript:void(-1);">Download CSV</a>
csv = soup.find('a', class_="stats-csv")
print(csv)

This finds the button, but how do I trigger the download it performs?
Selenium appears to be the only option. After matching the ChromeDriver version to my Chrome install, I added the following code. It finds the download button, but I haven't yet managed to complete the download.

# import modules
from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.common.keys import Keys
driver = webdriver.Chrome() 
print("fetching csv button for {} ".format(base_url))
driver.get(base_url + '/stats/')
driver.implicitly_wait(2) # seconds
try:
  assert "Free Australian Horse Racing Statistics - Punters.com.au" in driver.title
  elem = driver.find_element(By.CLASS_NAME,"stats-csv").send_keys("webdriver" + Keys.ENTER)
except:
  print ("\n%Error-website: {}\n".format(base_url + '/stats/'))    
Mr Ed
  • The button runs Javascript code. You will need Selenium (or equivalent) to run that code. – Tim Roberts Apr 11 '23 at 05:48
  • I was looking for a solution that didn't use Selenium but is that my only option? – Mr Ed Apr 11 '23 at 05:59
  • I recall now why I was avoiding Selenium - selenium.common.exceptions.SessionNotCreatedException: Message: session not created: This version of ChromeDriver only supports Chrome version 81 – Mr Ed Apr 11 '23 at 06:46
  • ChromeDriver mismatch solved - https://stackoverflow.com/a/73366828/3129642 – Mr Ed Apr 11 '23 at 07:15

2 Answers


There may be a way to do this without Selenium, but after the comments I settled on using it. Here is the solution:

from selenium import webdriver
from selenium.webdriver.common.by import By

base_url = 'https://www.punters.com.au'
driver = webdriver.Chrome()
driver.get(base_url + '/stats/')
driver.implicitly_wait(2)  # seconds; applies to the find_element calls below
# Confirm the expected page loaded by checking its <title>
assert "Free Australian Horse Racing Statistics - Punters.com.au" in driver.title
# Click <a class="stats-csv" href="javascript:void(-1);">Download CSV</a>
driver.find_element(By.CLASS_NAME, "stats-csv").click()
# Click the element with class "Save" to confirm the download
driver.find_element(By.CLASS_NAME, "Save").click()
driver.close()

The downloaded file is punters-com-au-downloaded-stats-jockeys-YYYY-MM-DD HH_MM_SS.csv. Thanks for your help!
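For anyone who needs the file in a known folder, here is a minimal sketch of one way to do it. It is not tested against the site. The helper names `wait_for_download` and `download_stats_csv` and the `download_dir` argument are made up for illustration. It assumes Chrome writes partial downloads as `*.crdownload` files and that clicking the link starts the download directly; if the page shows its own "Save" element as in the code above, that second click would still be needed.

```python
import glob
import os
import time


def wait_for_download(directory, pattern="*.csv", timeout=30.0, poll=0.5):
    """Poll `directory` until a file matching `pattern` exists and no
    Chrome partial download (*.crdownload) remains; return its path."""
    deadline = time.monotonic() + timeout
    while True:
        partial = glob.glob(os.path.join(directory, "*.crdownload"))
        done = glob.glob(os.path.join(directory, pattern))
        if done and not partial:
            # Newest match wins if several CSVs are already in the folder.
            return max(done, key=os.path.getmtime)
        if time.monotonic() >= deadline:
            raise TimeoutError(f"no file matching {pattern!r} in {directory!r}")
        time.sleep(poll)


def download_stats_csv(download_dir):
    """Click 'Download CSV' on the stats page and return the saved file's path."""
    # Imported here so wait_for_download stays usable without Selenium installed.
    from selenium import webdriver
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support import expected_conditions as EC
    from selenium.webdriver.support.ui import WebDriverWait

    options = webdriver.ChromeOptions()
    # Chrome preferences: save straight into download_dir without a prompt.
    options.add_experimental_option("prefs", {
        "download.default_directory": os.path.abspath(download_dir),
        "download.prompt_for_download": False,
    })
    driver = webdriver.Chrome(options=options)
    try:
        driver.get("https://www.punters.com.au/stats/")
        # Wait until the link is clickable rather than relying on a fixed delay.
        WebDriverWait(driver, 10).until(
            EC.element_to_be_clickable((By.CLASS_NAME, "stats-csv"))
        ).click()
        return wait_for_download(download_dir)
    finally:
        driver.quit()
```

Calling `download_stats_csv("downloads")` should return the path of the saved CSV, or raise `TimeoutError` if nothing arrives within 30 seconds. Waiting for the file before quitting the driver avoids closing the browser while the download is still in progress.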

Mr Ed

You can use the Beautiful Soup and Requests libraries for this.

Hyperb