
I have the following Python script:

import requests
from bs4 import BeautifulSoup

URL = 'https://www.ncaa.com/game/3518260/play-by-play'
page = requests.get(URL)
soup = BeautifulSoup(page.content, 'html.parser')

game = soup.find(id='gamecenter-tab-play-by-play')

print(game)

When I run it, it prints None.


Inspecting the website's code in the browser developer tools shows the element as a div with the class gamecenter-tab-play-by-play.

Can someone explain to me why the code is failing to find the div that I am pointing to? I have also tried .find_all, but that is not working either.

Thank you for your time, and if there is anything I can supply to help clarify, please let me know.

DougM
  • `game = soup.findAll("div", {'class': 'gamecenter-tab-play-by-play'})` is the right selector for the div you're seeking; however, it still returns an empty list. – PacketLoss Jan 17 '20 at 05:44
  • Does this answer your question? [Web-scraping JavaScript page with Python](https://stackoverflow.com/questions/8049520/web-scraping-javascript-page-with-python). You're searching by id but your element shows a class. But if the data is injected with JS, you won't get the data you want by requesting and parsing the HTML even with the correct DOM name. You'll need a tool like Selenium or make a request to the JSON endpoint that the page is using. – ggorlen Jan 17 '20 at 05:45

1 Answer


The above comment by ggorlen is the answer you are looking for: the play-by-play content is injected with JavaScript after the page loads, so requests only receives the empty container div and BeautifulSoup has nothing to find.
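That comment offers two routes. One is to skip the browser entirely and request the JSON endpoint the page loads its data from; the other, Selenium, is shown below. The URL in this sketch is only a hypothetical placeholder; find the real endpoint in the Network tab of your browser's developer tools while the play-by-play page loads.

import requests

# Hypothetical placeholder URL; look up the real endpoint in the browser's
# Network tab while the play-by-play page is loading.
PBP_JSON_URL = 'https://example.com/some/play-by-play.json'

response = requests.get(PBP_JSON_URL, timeout=10)
response.raise_for_status()
data = response.json()  # already structured data, no HTML parsing needed
print(data)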

If you want to scrape with Selenium, you need to download a driver for your browser.

For example, if you want to use Chrome, you can get ChromeDriver from https://chromedriver.chromium.org/

Note: get the same version as your installed browser, or it won't work.

from bs4 import BeautifulSoup
from selenium import webdriver

def get_scraped_data():
    chrome_options = webdriver.ChromeOptions()
    # Point executable_path at the chromedriver binary you downloaded
    driver = webdriver.Chrome(options=chrome_options,
                              executable_path=<chrome driver path>)
    driver.get(url='https://www.ncaa.com/game/3518260/play-by-play')
    # page_source holds the HTML after the browser has executed the JavaScript
    values = driver.page_source
    driver.quit()  # quit() closes every window, so a separate close() is not needed
    soup = BeautifulSoup(values, 'html.parser')
    game = soup.find_all('div', {'class': 'gamecenter-tab-play-by-play'})
    return game
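Once the driver path is filled in, the function can be called like any other; a minimal usage sketch:

game_divs = get_scraped_data()
print(len(game_divs))  # number of matching divs found
for div in game_divs:
    print(div.get_text(strip=True))  # text content of the play-by-play container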

If you want to run the browser in the background instead of popping up a window, pass the following option to the driver:

chrome_options.add_argument('--headless')
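Since the options object is created inside get_scraped_data, the flag belongs there, right after ChromeOptions() is built; a sketch of just that part:

chrome_options = webdriver.ChromeOptions()
chrome_options.add_argument('--headless')  # Chrome runs with no visible window
driver = webdriver.Chrome(options=chrome_options,
                          executable_path=<chrome driver path>)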

Vignesh Krishnan