I am working on a project in which I have to scrape images related to a keyword from an image site. When I search for a keyword on Imgur (my choice of image site), the results are shown as small thumbnails; clicking one opens the main article, which contains one or more images. For now, my program builds a list of the links in the thumbnails and opens them one by one to download all the images in each.
My problem is that when I inspect an image element on the article page in the browser, it shows that the image has the class "image-placeholder", but when I download the HTML with requests.get, no element with that class is present.
For example, when I request the article https://i.stack.imgur.com/NjBPc.jpg and print the response,
I get the HTML below:
<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="utf-8"/>
<meta content="width=device-width,initial-scale=1" name="viewport"/>
<meta content="funny, image, gif, gifs, memes, jokes, image upload, upload image, lol, humor, vote, comment, share, imgur, imgur.com, wallpaper" name="keywords">
<meta content="Discover the magic of the internet at Imgur, a community powered entertainment destination. Lift your spirits with funny jokes, trending memes, entertaining gifs, inspiring stories, viral videos, and so much more." name="description">
<meta content="Copyright 2020 Imgur, Inc." name="copyright"/>
<link href="https://s.imgur.com/images/favicon-32x32.png" rel="icon" sizes="32x32" type="image/png"/>
<link href="https://s.imgur.com/images/favicon-96x96.png" rel="icon" sizes="96x96" type="image/png"/>
<link href="https://s.imgur.com/images/favicon-16x16.png" rel="icon" sizes="16x16" type="image/png"/>
<link href="https://s.imgur.com/images/favicon-152.png" rel="apple-touch-icon-precomposed"/>
<meta content="#2cd63c" name="msapplication-TileColor"/>
<meta content="https://s.imgur.com/images/favicon-144.png" name="msapplication-TileImage"/>
<link href="https://m.imgur.com/" media="only screen and (max-width: 640px)" rel="alternate"/>
<meta content="834554521765408b9effdc758b69c5ee" name="p:domain_verify">
<meta content="Imgur" property="og:site_name">
<meta content="12331492" property="fb:admins"/>
<meta content="12301369" property="fb:admins"/>
<meta content="127621437303857" property="fb:app_id"/>
<meta content="imgur://imgur.com/?from=fbreferral" property="al:android:url"/>
<meta content="Imgur" property="al:android:app_name"/>
<meta content="com.imgur.mobile" property="al:android:package"/>
<meta content="imgur://imgur.com/?from=fbreferral" property="al:ios:url"/>
<meta content="639881495" property="al:ios:app_store_id"/>
<meta content="Imgur" property="al:ios:app_name"/>
<meta content="https://imgur.com/" property="al:web:url"/>
<meta content="@imgur" name="twitter:site"/>
<meta content="imgur.com" name="twitter:domain"/>
<meta content="com.imgur.mobile" name="twitter:app:id:googleplay"/>
<meta content="Imgur" property="author"/>
<meta content="Imgur" property="article:author"/>
<meta content="https://www.facebook.com/imgur" property="article:publisher"/>
<title>
Drawing Model (for you all naruto fans) - Imgur
</title>
<meta content="https://i.stack.imgur.com/NjBPc.jpg" data-react-helmet="true" property="og:url"/>
<meta content="https://i.imgur.com/rfEAUDWh.jpg" data-react-helmet="true" name="twitter:image"/>
<link href="https://api.imgur.com/oembed.json?url=https://i.stack.imgur.com/NjBPc.jpg" rel="alternate" title="Drawing Model (for you all naruto fans) - Imgur" type="application/json+oembed"/>
<link href="https://api.imgur.com/oembed.xml?url=https://i.stack.imgur.com/NjBPc.jpg" rel="alternate" title="Drawing Model (for you all naruto fans) - Imgur" type="application/xml+oembed"/>
<meta content="funny" data-react-helmet="true" property="article:tag"/>
<meta content="" data-react-helmet="true" property="article:tag"/>
<meta content="" data-react-helmet="true" property="article:tag"/>
<meta content="" data-react-helmet="true" property="article:tag"/>
<meta content="" data-react-helmet="true" property="article:tag"/>
<meta href="https://i.stack.imgur.com/NjBPc.jpg" rel="canonical"/>
<meta content="none" name="robots"/>
<meta content="600" data-react-helmet="true" property="og:image:width"/>
<meta content="315" data-react-helmet="true" property="og:image:height"/>
<meta content="https://i.imgur.com/rfEAUDW.jpg?fb" data-react-helmet="true" property="og:image"/>
<meta content="article" data-react-helmet="true" property="og:type"/>
<meta content="summary_large_image" data-react-helmet="true" name="twitter:card"/>
<script>
dataLayer=[];var pbjs=pbjs||{};pbjs.que=pbjs.que||[]
</script>
<script>
!function(e,t,a,n,g){e[n]=e[n]||[],e[n].push({"gtm.start":(new Date).getTime(),event:"gtm.js"});var m=t.getElementsByTagName(a)[0],r=t.createElement(a);r.async=!0,r.src="//www.googletagmanager.com/gtm.js?id=GTM-M6N38SF",m.parentNode.insertBefore(r,m)}(window,document,"script","dataLayer")
</script>
<link href="https://s.imgur.com/desktop-assets/css/styles.ebc99cf807f6b7c8c39c.css" rel="stylesheet"/>
</meta>
</meta>
</meta>
</meta>
</head>
<body>
<noscript>
<iframe height="0" src="https://www.googletagmanager.com/ns.html?id=GTM-M6N38SF" style="display:none;visibility:hidden" width="0">
</iframe>
</noscript>
<noscript>
If you're seeing this message, that means
<strong>
JavaScript has been disabled on your browser
</strong>
, please
<strong>
enable JS
</strong>
to make Imgur work.
</noscript>
<div id="root">
</div>
<script async="" src="https://www.googletagmanager.com/gtag/js?id=UA-6671908-15">
</script>
<script>
function gtag(){dataLayer.push(arguments)}window.dataLayer=window.dataLayer||[],gtag("js",new Date),gtag("config","UA-6671908-15",{send_page_view:!1})
</script>
<script class="abp" src="https://s.imgur.com/min/px.js?ch=1">
</script>
<script class="abp" src="https://s.imgur.com/min/px.js?ch=2">
</script>
<script src="https://s.imgur.com/desktop-assets/js/main.2e34f379cd8d1a3ca8b1.js">
</script>
</body>
</html>
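One thing I noticed in the HTML above: although `<div id="root">` is empty, an image URL does appear in the `og:image` and `twitter:image` meta tags. The following is a minimal sketch, not a confirmed solution, that pulls those values out using only the standard library's `html.parser`. `MetaImageParser` and `extract_meta_images` are names I made up for illustration, and I am only assuming these tags point at the article's image; I have not checked this across other articles or for articles with several images.

```python
from html.parser import HTMLParser


class MetaImageParser(HTMLParser):
    """Collects the content of og:image / twitter:image meta tags."""

    WANTED = {"og:image", "twitter:image"}

    def __init__(self):
        super().__init__()
        self.images = []

    def handle_starttag(self, tag, attrs):
        # Self-closing <meta .../> tags are also routed here by HTMLParser.
        if tag != "meta":
            return
        attrs = dict(attrs)
        key = attrs.get("property") or attrs.get("name")
        if key in self.WANTED and attrs.get("content"):
            self.images.append(attrs["content"])


def extract_meta_images(html):
    """Return the image URLs found in the meta tags, in document order."""
    parser = MetaImageParser()
    parser.feed(html)
    return parser.images


# Three lines copied from the response shown above.
sample = '''
<meta content="https://i.imgur.com/rfEAUDWh.jpg" data-react-helmet="true" name="twitter:image"/>
<meta content="600" data-react-helmet="true" property="og:image:width"/>
<meta content="https://i.imgur.com/rfEAUDW.jpg?fb" data-react-helmet="true" property="og:image"/>
'''
print(extract_meta_images(sample))
```

In the scraper, `extract_meta_images(res1.text)` would take the place of the `.image-placeholder` lookup, but I do not know whether that is the intended way to get the images.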
Thanks in advance for all your help.
My code is:
import os
from urllib.parse import urljoin

import requests
from bs4 import BeautifulSoup

# User input for the phrase to search.
phrase = input('Enter text to search: ')
finalphrase = phrase.replace(' ', '+')
print('Searching for ' + finalphrase + '.....')

# Make soup of the search page and collect all results as galleryElem.
site = 'https://imgur.com'
res = requests.get(site + '/search?q=' + finalphrase)
res.raise_for_status()
soup = BeautifulSoup(res.text, 'html.parser')
galleryElem = soup.select('.image-list-link')

# Folder named after the search phrase to hold the downloads.
os.makedirs(finalphrase, exist_ok=True)
number = 0

# Open each thumbnail link and download the images on that page.
for link in galleryElem:
    imageLink = link.get('href')
    res1 = requests.get(site + imageLink)
    soup1 = BeautifulSoup(res1.text, 'html.parser')
    # Search the article page (soup1), not the search page (soup).
    # This is the selector that comes back empty.
    imageElem = soup1.select('.image-placeholder')
    for elem in imageElem:
        image = elem.get('src')
        if not image:
            continue
        # src may be relative or protocol-relative, so resolve it first.
        imageRes = requests.get(urljoin(site, image))
        # Prefix a counter so files with the same name do not collide.
        filename = str(number) + '_' + os.path.basename(image)
        with open(os.path.join(finalphrase, filename), 'wb') as img:
            for a in imageRes.iter_content(100000):
                img.write(a)
        number += 1
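To rule out a mistake in my selector, I also wanted a way to confirm that the class really is absent from the downloaded HTML. Below is a rough sketch of such a check using only the standard library. `ClassCounter` and `count_class` are hypothetical helper names, `static_html` is a shortened copy of the body I received, and `rendered_html` is a made-up stand-in for what the browser inspector shows, not real Imgur markup.

```python
from html.parser import HTMLParser


class ClassCounter(HTMLParser):
    """Counts start tags whose class attribute contains a given class name."""

    def __init__(self, css_class):
        super().__init__()
        self.css_class = css_class
        self.count = 0

    def handle_starttag(self, tag, attrs):
        # The class attribute may be missing or valueless (None).
        classes = (dict(attrs).get("class") or "").split()
        if self.css_class in classes:
            self.count += 1


def count_class(html, css_class):
    """Return how many elements in html carry css_class."""
    counter = ClassCounter(css_class)
    counter.feed(html)
    return counter.count


# Shortened version of the body returned by requests (see the HTML above).
static_html = '<body><div id="root">\n</div><script src="main.js"></script></body>'
# Made-up example of the markup visible in the browser inspector.
rendered_html = '<div id="root"><img class="image-placeholder" src="x.jpg"></div>'

print(count_class(static_html, "image-placeholder"))
print(count_class(rendered_html, "image-placeholder"))
```

Running `count_class(res1.text, "image-placeholder")` inside the loop gives a count for each article page, which is how I would tell a missing element apart from a selector typo.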