
I am trying to scrape text from the highlights and full commentary sections of the following Cricbuzz page, but the text just doesn't come up after a few days of trying. Sorry, I am a beginner and I don't really know a lot about web scraping.

I have tried a few other sections and was able to scrape text and tables from them, but this section holds tabbed/clickable text that I don't know how to scrape on this particular page. Following is my code so far for the highlights section:

code:

from urllib.request import urlopen as req
from bs4 import BeautifulSoup as soup

my_url = "https://www.cricbuzz.com/cricket-match-highlights/20567/ausw-vs-nzw-10th-match-group-b-icc-womens-world-t20-2018"

# Download the raw HTML of the page
uclient = req(my_url)
page_html = uclient.read()
uclient.close()

# Parse the HTML and look for the highlights container
page_soup = soup(page_html, "html.parser")
highlights = page_soup.find_all("div", {"class": "cb-col cb-col-67 cb-nws-lft-col"})

# Print the text of each matching element
for highlight in highlights:
    text_highlight = highlight.text
    print(text_highlight)
  • If I navigate to the page in question, there are no `nws-lft-col` class elements on the page. What output were you expecting? – ggorlen Jan 01 '19 at 16:17
  • The link that you have given and link in your code are different. – Bitto Jan 01 '19 at 16:31
  • Apart from the errors in your script (2056 should be 20567, `cb nws-lft-col` should be `cb-nws-lft-col`, `highlights.text` should be `highlight.text`), you are targeting a dynamic website that uses React-style code to replace the HTML and retrieve content asynchronously. You need more than BeautifulSoup for that. See [this answer](https://stackoverflow.com/a/17599897/5459839) to maybe get it done with Selenium. – trincot Jan 01 '19 at 18:37
  • Check out the links [highlights](https://www.cricbuzz.com/match-api/20567/highlights.json) and [commentary](https://www.cricbuzz.com/match-api/20567/commentary-full.json) to parse JSON content from there (a sketch of this approach follows the comments). – SIM Jan 01 '19 at 19:51
  • Sorry for the minor errors. I am new to this and kinda suck at it. And yes @trincot I'll take a look at that answer. Thank you all and sorry for the trouble with the errors in the question. – Aniket Angwalkar Jan 03 '19 at 14:31
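
Building on SIM's comment, here is a minimal sketch of reading one of the match-api JSON feeds directly instead of scraping the rendered page. It assumes the https://www.cricbuzz.com/match-api/20567/highlights.json endpoint from the comment still responds with JSON; the field layout is not documented here, so the sketch only fetches the data and prints its top-level structure for inspection.

import json
from urllib.request import Request, urlopen

# Endpoint taken from SIM's comment above; its response format is an assumption,
# so inspect the structure before relying on any particular key names.
api_url = "https://www.cricbuzz.com/match-api/20567/highlights.json"

# Some sites reject requests that lack a browser-like User-Agent header.
request = Request(api_url, headers={"User-Agent": "Mozilla/5.0"})
with urlopen(request) as response:
    data = json.loads(response.read().decode("utf-8"))

# Explore the structure first, then drill down to the commentary text.
print(type(data))
if isinstance(data, dict):
    print(list(data.keys()))

If the JSON route does not work out, the Selenium answer trincot linked renders the page in a real browser first, so driver.page_source contains the dynamically inserted highlights markup and the same BeautifulSoup parsing can be applied to it.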
