I want to find the image that best represents a web page.
My code is:
from bs4 import BeautifulSoup
import requests

r = requests.get("http://www.test.com/")  # Download the page
data = r.text  # Page source as text
soup = BeautifulSoup(data, "html.parser")  # Parse it; an explicit parser avoids a bs4 warning
links = []
for link in soup.find_all('img'):  # Cycle through all 'img' tags
    imgSrc = link.get('src')  # Extract the 'src' attribute from each tag
    links.append(imgSrc)  # Append the source to 'links'
print(links)
I know these three methods might be useful:

1. Check for Open Graph/Twitter Card tags
2. Find the largest suitable image on the page
3. Look for a suitable video thumbnail if no image is found
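For the first method, many sites declare their preferred share image in `<head>` meta tags, so checking those before scraping `<img>` tags is cheap. A minimal sketch, assuming BeautifulSoup is installed; the helper name `find_meta_image` and the example HTML are mine, not from any library:

```python
from bs4 import BeautifulSoup

def find_meta_image(html):
    """Return the og:image or twitter:image URL if declared, else None."""
    soup = BeautifulSoup(html, "html.parser")
    # Open Graph tags use the 'property' attribute; Twitter Cards use 'name'
    for attrs in ({"property": "og:image"}, {"name": "twitter:image"}):
        tag = soup.find("meta", attrs=attrs)
        if tag and tag.get("content"):
            return tag["content"]
    return None

sample = '<html><head><meta property="og:image" content="http://example.com/a.jpg"></head></html>'
print(find_meta_image(sample))
```

If this returns a URL you can stop there; otherwise fall back to scanning the `<img>` tags as in the code above.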
I want to get the biggest image on a page based on its pixel dimensions. I researched this but couldn't find anything that is both good and fast to execute.
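To compare images by real pixel dimensions you have to inspect the image bytes themselves, since `width`/`height` attributes are often missing. One sketch, assuming the Pillow library (`pip install Pillow`); the helper name `area_from_bytes` is mine:

```python
import io

from PIL import Image  # Pillow, assumed installed

def area_from_bytes(data):
    """Return width * height in pixels for raw image bytes, or 0 if unreadable."""
    try:
        img = Image.open(io.BytesIO(data))
        return img.width * img.height
    except Exception:
        return 0

# With the 'links' list from the scraper above, the biggest image could be
# picked roughly like this (downloads every image, so it is not the fastest):
# import requests
# biggest = max(links, key=lambda u: area_from_bytes(requests.get(u, timeout=5).content))
```

Downloading every image in full is the simple approach; a faster variant would fetch only the first few kilobytes of each file, which is usually enough for Pillow to read the header and report the dimensions.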