
I'm trying to scrape a site, but Beautiful Soup isn't returning the HTML that I see when inspecting the page manually. The returned soup also includes the phrase "You are using an outdated browser".

I have tried different parsers, and I have also tried fetching the page with the urllib module.

I have a strong feeling that it's due to the time it takes for the website to load, because the code works for some other websites. Is there a way to make Beautiful Soup wait until the entire page has loaded?

Here's my code:

import requests
from bs4 import BeautifulSoup

def get_stock():
    URL = 'https://www.cse.lk/home/tradeSummary'
    page = requests.get(URL)
    soup = BeautifulSoup(page.content, 'html.parser')
    print(soup.get_text())
    stocks = soup.find_all("td")
    for stock in stocks:
        line = stock.get_text()
        print(line)

get_stock()

Thank you:)

Vimuth
  • 2
    Try viewing the page source. Dynamic websites built with JavaScript frameworks (such as React) render their content in the browser, so requests and BeautifulSoup only see the initial HTML. Try using `Selenium` instead. – Vishal Dhawan Jul 04 '20 at 07:28
  • 2
    The stocks in this specific page aren't loaded as part of the initial HTTP call. They are loaded by a *POST* request to https://www.cse.lk/api/tradeSummary. You can see that in the developer tools in your browser. – Roy2012 Jul 04 '20 at 07:29
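
Following the comment above, one option is to call that endpoint directly instead of parsing the page. The snippet below is only a rough sketch under several assumptions: that the endpoint accepts a POST with an empty body, that it returns JSON, and that the records sit in a list of objects somewhere at the top level of the response. None of that is confirmed here, and the helper names `fetch_trade_summary` and `find_records` are made up for illustration. It uses urllib from the standard library; `requests.post(URL)` would be the equivalent call with the library used in the question.

```python
import json
import urllib.request

# Endpoint named in the comment above; everything else here is an assumption.
API_URL = "https://www.cse.lk/api/tradeSummary"


def fetch_trade_summary(url=API_URL, timeout=10):
    """POST to the endpoint and return the decoded JSON payload."""
    request = urllib.request.Request(
        url,
        data=b"",  # assumption: the endpoint needs no form fields
        headers={"User-Agent": "Mozilla/5.0"},  # assumption: a browser-like agent is accepted
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


def find_records(payload):
    """Return the records in a decoded JSON payload.

    If the payload is a list, keep its dict items. If it is a dict,
    return the first non-empty top-level value that is a list of dicts.
    Otherwise return an empty list.
    """
    if isinstance(payload, list):
        return [item for item in payload if isinstance(item, dict)]
    if isinstance(payload, dict):
        for value in payload.values():
            if (
                isinstance(value, list)
                and value
                and all(isinstance(item, dict) for item in value)
            ):
                return value
    return []
```

Calling `find_records(fetch_trade_summary())` and printing each record should show which fields the site actually sends. If the response has a different shape, the developer tools network tab shows the real request body and response, and the helper would need adjusting to match.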
