I'm testing this code to try to download around 120 Excel files from one URL:
import requests
from bs4 import BeautifulSoup

headers = {"user-agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/97.0.4692.71 Safari/537.36"}
resp = requests.get("https://healthcare.ascension.org/price-transparency/price-transparency-files", headers=headers)
soup = BeautifulSoup(resp.text, "html.parser")

for link in soup.find_all('a', href=True):
    if 'xls' in link['href']:
        print(link['href'])
        url = "https://healthcare.ascension.org" + link['href']
        data = requests.get(url)
        print(data)
        with open(f'C:/Users/ryans/Downloads/{url.split("/")[-1].split(".")[0]}.xls', 'wb') as output:
            output.write(data.content)
This line, data=requests.get(url), always gives me Response [406]. Apparently HTTP 406 means "Not Acceptable", according to HTTP.CAT and Mozilla. I'm not sure what is wrong here, but I think I should end up with 120 Excel files, with data, downloaded. Right now I am getting 120 Excel files on my laptop, but none of the files have any data in them.
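
One visible difference between the two requests is that the page request passes the user-agent header while the per-file request does not. Below is a minimal sketch of the download step that sends the same header on every request through a requests.Session and only writes a file when the server answers 200. It assumes, without confirmation, that the missing header is what the server objects to. The names make_session, target_path, and download_excel are hypothetical helpers, not part of the original code.

```python
from pathlib import Path
from urllib.parse import urljoin, urlsplit

import requests

BASE_URL = "https://healthcare.ascension.org"
# Same user-agent string that the page request already sends.
HEADERS = {"user-agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/97.0.4692.71 Safari/537.36"}


def make_session():
    """Return a session that sends HEADERS on every request it makes."""
    session = requests.Session()
    session.headers.update(HEADERS)
    return session


def target_path(url, dest_dir):
    """Build the local .xls path from the last segment of the URL path."""
    stem = urlsplit(url).path.split("/")[-1].split(".")[0]
    return Path(dest_dir) / f"{stem}.xls"


def download_excel(session, href, dest_dir):
    """Fetch one file and write it only on HTTP 200. Returns the status code."""
    url = urljoin(BASE_URL, href)
    resp = session.get(url, timeout=30)
    if resp.status_code == 200:
        target_path(url, dest_dir).write_bytes(resp.content)
    return resp.status_code
```

Inside the existing loop this would be called as download_excel(session, link['href'], 'C:/Users/ryans/Downloads'), with session = make_session() created once before the loop. Printing the returned status code shows whether the 406 persists, and skipping the write on a non-200 response avoids leaving empty files behind.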