
I am making a basic web crawler/spider with Python. I am trying to crawl through a YouTube channel and print all the titles of the videos on it, but it never returns anything.

Here is my code so far:

import requests
from bs4 import BeautifulSoup

url = 'https://www.youtube.com/c/DanTDM/videos'

response = requests.get(url)

soup = BeautifulSoup(response.text, 'html.parser')

x = soup.select(".yt-simple-endpoint style-scope ytd-grid-video-renderer")

print(x)

The output is always [], an empty list (which means it didn't find anything). I need to know what I'm doing wrong.

2 Answers


The code seems correct.

Call print(response.text) and see whether YouTube is returning a blocking page.

Anti-scraping measures may be in action, such as checking your user agent.
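
A minimal sketch of that check, with a browser-like User-Agent header added; the header string below is only an example value, not something YouTube specifically requires:

import requests

url = 'https://www.youtube.com/c/DanTDM/videos'

# A browser-like User-Agent; the exact string is only an example.
headers = {
    'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) '
                  'AppleWebKit/537.36 (KHTML, like Gecko) '
                  'Chrome/103.0.0.0 Safari/537.36'
}

response = requests.get(url, headers=headers)

# Inspect the raw HTML: if it is a consent/blocking page, the video
# grid will not be present no matter which selector you use.
print(response.status_code)
print(response.text[:2000])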

Jonathan Ciapetti
  • Yeah, it could also be that some JavaScript is running client-side that isn't being picked up. See this related question: https://stackoverflow.com/questions/26393231/using-python-requests-with-javascript-pages – bartius Jul 18 '22 at 01:24
  • No no, I see what's happening. My response was HTML, of course, and it was the 'Before you continue to YouTube' page. I would need to use Selenium and WebDriver to automate it and then maybe run the other code, or make a different web crawler using Selenium. – blackscratch22 Jul 18 '22 at 09:14
  • Yes, by "blocking page" I also meant a dialog with dynamic content. No wonder YouTube made the dialog that way ... – Jonathan Ciapetti Jul 21 '22 at 01:11

Browser Automation with Selenium

When I send a request to YouTube, I receive a 'Before you continue to YouTube' consent page instead of the channel's videos.

So...

We should use Selenium instead, as we need to click one of the buttons. I don't think we can interact with the website using the requests module.

Selenium allows you to have control over your browser. Read the documentation!
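
A rough sketch of that approach, assuming a matching chromedriver is installed; the consent-button XPath and the a#video-title selector are guesses about YouTube's markup and may need adjusting:

from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC

url = 'https://www.youtube.com/c/DanTDM/videos'

driver = webdriver.Chrome()  # assumes chromedriver is on PATH
driver.get(url)

# If the 'Before you continue to YouTube' consent dialog appears, click an
# accept button. This XPath is an assumption and varies by region/language.
try:
    accept = WebDriverWait(driver, 10).until(
        EC.element_to_be_clickable(
            (By.XPATH, "//button[.//span[contains(text(), 'Accept')]]"))
    )
    accept.click()
except Exception:
    pass  # no consent dialog was shown

# Wait for the video grid to render, then print the titles.
# 'a#video-title' is a guess based on the channel page's rendered markup.
WebDriverWait(driver, 15).until(
    EC.presence_of_all_elements_located((By.CSS_SELECTOR, 'a#video-title')))

for video in driver.find_elements(By.CSS_SELECTOR, 'a#video-title'):
    print(video.get_attribute('title') or video.text)

driver.quit()

If the consent dialog isn't dismissed, the wait for a#video-title will simply time out, which is also a quick way to confirm that the blocking page is the problem rather than the selector.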