I am trying to extract the text of the comments on a web page, given its URL, using BeautifulSoup for the scraping. The comments are visible on the page when I open the URL in a browser, but the soup object returned by BeautifulSoup does not contain those tags or their text.
I used BeautifulSoup with 'html.parser'. I successfully extracted the number of likes/views/comments of the video on the page, but the comment section itself was not in the fetched HTML. The browser I used is Chrome, and the system is Ubuntu 18.04.1 LTS.
This is the code I used (in Python):
from urllib.request import urlopen
import urllib.error
from bs4 import BeautifulSoup

webpage_link = "https://www.airvuz.com/video/Majestic-Beast-Nanuk?id=59b2a56141ab4823e61ea901"
try:
    page = urlopen(webpage_link)
except urllib.error.HTTPError as err:  # webpage cannot be found
    print("ERROR! %s" % webpage_link)
    raise
soup = BeautifulSoup(page, 'html.parser')
I expected the soup object to contain all the content that is visible on the webpage, especially the text of the comments (such as "Not being there I enjoyed a lot seeing the life style of white bear. Thanks to the provider for such documentary." and "WOOOW... amazing..."); however, I could not find the corresponding nodes in the soup object. Any help would be appreciated!
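For context, my understanding is that BeautifulSoup only parses the HTML the server sends back; it never executes JavaScript, so anything injected client-side is invisible to it. A minimal self-contained sketch (using a hypothetical static response with an empty comments container, not the actual markup of the site) illustrates what I suspect is happening:

```python
from bs4 import BeautifulSoup

# Hypothetical static HTML mimicking what a server might return:
# the comments container exists, but its contents are filled in
# later by client-side JavaScript, which BeautifulSoup never runs.
static_html = """
<html><body>
  <h1>Majestic Beast Nanuk</h1>
  <div id="comments"></div>
</body></html>
"""

soup = BeautifulSoup(static_html, "html.parser")
comments_div = soup.find("div", id="comments")

# The container node is present in the static markup...
print(comments_div is not None)
# ...but it holds no comment elements, because those would only be
# added by JavaScript running in the browser.
print(comments_div.find_all("p"))
```

If that is indeed the cause here, is there a standard way to get at the JavaScript-rendered content?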