The HTML returned is incomplete: about half of the image links are missing from it. Here is the code I am using.

Code:

import requests
from bs4 import BeautifulSoup

# Browser-like User-Agent so the request is not rejected as a bot.
user_agent = {"User-Agent": "Mozilla/5.0 "
                            "(Windows NT 10.0; Win64; x64) "
                            "AppleWebKit/537.36 (KHTML, like Gecko) "
                            "Chrome/80.0.3987.163 Safari/537.36"}

data = input("Search: ")
n = int(input("Number: "))

# Pass the search term via params so requests URL-encodes it.
response = requests.get("https://9gag.com/search",
                        params={"query": data},
                        headers=user_agent)

print(response.url, "\n")

soup = BeautifulSoup(response.text, "lxml")
print(soup.prettify())

# Print the first n post containers found in the returned HTML.
for post in soup.find_all("div", class_="post-container", limit=n):
    print(post, "\n")

The output does not include the required div with class "post-container", which contains the image link:

Required HTML Snippet
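One way to tell whether the markup is really absent from the response, rather than the selector being wrong, is to count the matching divs in the raw HTML. The sketch below uses only the standard library; `ClassCounter` and `count_divs_with_class` are hypothetical helper names, and the two HTML strings are made-up stand-ins for a static response and a browser-rendered page, not real 9gag output.

```python
from html.parser import HTMLParser


class ClassCounter(HTMLParser):
    """Counts <div> start tags whose class attribute includes a given class."""

    def __init__(self, class_name):
        super().__init__()
        self.class_name = class_name
        self.count = 0

    def handle_starttag(self, tag, attrs):
        if tag != "div":
            return
        # The class attribute may hold several space-separated names.
        classes = (dict(attrs).get("class") or "").split()
        if self.class_name in classes:
            self.count += 1


def count_divs_with_class(html, class_name):
    """Return how many <div> elements in html carry class_name."""
    parser = ClassCounter(class_name)
    parser.feed(html)
    return parser.count


# Made-up examples: a static shell whose posts would be injected by a script,
# and the same page after a browser has rendered it.
static_html = '<div id="page"><script>/* posts injected here */</script></div>'
rendered_html = ('<div id="page"><div class="post-container">'
                 '<img src="a.jpg"></div></div>')

print(count_divs_with_class(static_html, "post-container"))
print(count_divs_with_class(rendered_html, "post-container"))
```

Running the same check on the text returned by requests would show whether the post containers are in the response at all; a count of zero with a correctly spelled class name suggests they are added later by JavaScript.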

2t2c
  • Please replace data and n with hard-coded values so everybody can reproduce the same effect – Ari Gold Apr 11 '20 at 11:46
  • Does this answer your question? [Scrape images from 9gag, unable to read correct HTML-page](https://stackoverflow.com/questions/57008817/scrape-images-from-9gag-unable-to-read-correct-html-page) – αԋɱҽԃ αмєяιcαη Apr 11 '20 at 12:40
  • @αԋɱҽԃαмєяιcαη Unfortunately no, but I did a little research and it turns out the Requests library only retrieves the static HTML, not content that is loaded dynamically. So I have to take a different approach using Selenium. – 2t2c Apr 11 '20 at 12:53
  • @2t2c Then can you edit your question and provide the expected output, so I can check whether it's possible with the requests module? – αԋɱҽԃ αмєяιcαη Apr 11 '20 at 12:54
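The Selenium approach mentioned in the comments could be sketched roughly as follows. This is an untested outline, not a confirmed solution: it assumes Selenium and a matching Chrome driver are installed, that the posts really are rendered into div elements with class "post-container" as the question states, and the helper names `build_search_url` and `fetch_rendered_html` are hypothetical.

```python
from urllib.parse import urlencode


def build_search_url(query):
    """Build the 9gag search URL with the query safely URL-encoded."""
    return "https://9gag.com/search?" + urlencode({"query": query})


def fetch_rendered_html(url, css_selector, timeout=10):
    """Load url in Chrome, wait for css_selector to appear, return the HTML.

    Assumes Selenium and a Chrome driver are available on this machine.
    """
    # Imported here so build_search_url works even without Selenium installed.
    from selenium import webdriver
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support import expected_conditions as EC
    from selenium.webdriver.support.ui import WebDriverWait

    driver = webdriver.Chrome()
    try:
        driver.get(url)
        # Block until at least one matching element exists in the DOM;
        # raises TimeoutException if none appears within `timeout` seconds.
        WebDriverWait(driver, timeout).until(
            EC.presence_of_element_located((By.CSS_SELECTOR, css_selector))
        )
        return driver.page_source
    finally:
        driver.quit()


# Example usage (opens a real browser, so it is left commented out):
# html = fetch_rendered_html(build_search_url("cat memes"), "div.post-container")
# The returned html can then be passed to BeautifulSoup as in the question.
```

The explicit wait is there because the page source right after `driver.get` may still lack the posts; waiting for the selector gives the page's scripts time to insert them.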

0 Answers