I would like to go through the web pages below and save the respective images using Python:
Examples (10,000 URLs in total):
https://cryptopunks.app/cryptopunks/cryptopunk0001.png
https://cryptopunks.app/cryptopunks/cryptopunk0002.png
https://cryptopunks.app/cryptopunks/cryptopunk9999.png
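If I understand the pattern correctly, the URLs can be built from a zero-padded four-digit index, something like this (I am assuming the indices run from 0 to 9999, which would give 10,000 URLs):

# Assumed URL pattern: zero-padded four-digit index (adjust the range if the first image is 0001)
urls = [f'https://cryptopunks.app/cryptopunks/cryptopunk{i:04d}.png' for i in range(10000)]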
My goal is to use the downloaded images afterwards to train a GAN for a project and generate new images with it.
I tried adapting the code below (which loops through web pages and downloads all the images) to the example URLs above, but unfortunately I cannot make it work:
from bs4 import BeautifulSoup as soup
import requests, contextlib, re, os

@contextlib.contextmanager
def get_images(url: str):
    # parse the listing page and collect [image src, file extension] pairs
    d = soup(requests.get(url).text, 'html.parser')
    yield [[i.find('img')['src'], re.findall(r'(?<=\.)\w+$', i.find('img')['alt'])[0]]
           for i in d.find_all('a') if re.findall(r'/image/\d+', i['href'])]

n = 3  # end value
os.makedirs('MARCO_images', exist_ok=True)  # added for automation; the folder can be named anything, as long as the same name is used when saving below
for i in range(n):
    with get_images(f'https://marco.ccr.buffalo.edu/images?page={i}&score=Clear') as links:
        print(links)
        for c, [link, ext] in enumerate(links, 1):
            with open(f'MARCO_images/MARCO_img_{i}{c}.{ext}', 'wb') as f:
                f.write(requests.get(f'https://marco.ccr.buffalo.edu{link}').content)
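For reference, here is a minimal sketch of what I imagine the direct-download approach could look like (assuming the URLs above are direct links to .png files, so no HTML parsing should be needed; the folder name cryptopunk_images and the index range are my own choices):

import os
import requests

out_dir = 'cryptopunk_images'  # arbitrary output folder name
os.makedirs(out_dir, exist_ok=True)

for i in range(10000):  # assuming indices 0000-9999; adjust if the first image is 0001
    url = f'https://cryptopunks.app/cryptopunks/cryptopunk{i:04d}.png'
    response = requests.get(url)
    if response.ok:
        # save the raw image bytes under the same zero-padded name
        with open(os.path.join(out_dir, f'cryptopunk{i:04d}.png'), 'wb') as f:
            f.write(response.content)
    else:
        print(f'Download failed for {url} (HTTP {response.status_code})')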
Could anyone please help me out?
Thanks a lot!