I'm trying to write a script in Python that downloads the image on this site that is updated every day:
https://apod.nasa.gov/apod/astropix.html
I was trying to follow the top comment from this post: How to extract and download all images from a website using beautifulSoup?
So, this is what my code currently looks like:
import re
import requests
from bs4 import BeautifulSoup
site = 'https://apod.nasa.gov/apod/astropix.html'
response = requests.get(site)
soup = BeautifulSoup(response.text, 'html.parser')
img_tags = soup.find_all('img')
urls = [img['src'] for img in img_tags]
for url in urls:
filename = re.search(r'/([\w_-]+[.](jpg|gif|png))$', url)
with open(filename.group(1), 'wb') as f:
if 'http' not in url:
# sometimes an image source can be relative
# if it is provide the base url which also happens
# to be the site variable atm.
url = '{}{}'.format(site, url)
response = requests.get(url)
f.write(response.content)
However, when I run my program I get this error:
Traceback on line 17
with open(filename.group(1), 'wb' as f:
AttributeError: 'NoneType' object has no attribute 'group'
So it looks like there is some problem with my Regex perhaps?