
I have been looking everywhere for a way to scrape all NBA props from app.prizepicks.com with Python. I narrowed it down to two potential methods: the API with pandas, or Selenium. I believe PrizePicks recently shut down its API to stop users from scraping the NBA props, so to my knowledge selenium-stealth is the only way left to scrape the PrizePicks NBA board. Can anyone help me with, or provide, code that scrapes PrizePicks for all NBA props? The information needed is the player name, the prop type (such as points, rebounds, 3-Pt Made, free throws made, fantasy, pts+rebs, etc.), and the prop line (such as 34.5 or 8.5, which could belong to prop types such as points and rebounds, respectively). I need this to run reasonably quickly and refresh every set number of minutes. I found something similar to what I want in another thread, posted by 'C. Peck', which I will include below (hopefully; I don't really know how to use Stack Overflow). But the code C. Peck provided does not work on my device, and I was wondering if anyone here could write working code or fix this code for me. I have a MacBook Pro, so I don't know if that affects anything.

EDIT: After a lot of trial and error, and help from the thread, I have managed to complete the first step. I can scrape the "Points" tab of the NBA league on PrizePicks, but I want to scrape the info from every tab, not just points. I honestly don't know why my code isn't fully working; I basically want it to scrape points, rebounds, assists, fantasy, etc. Let me know what I should fix to be able to scrape every stat_element in the stat_container, or any other methods too! I'll update the code below:

EDIT AGAIN: It seems like the problem lies in the "stat-container" and "stat-elements". I checked which elements "stat-elements" contains, and it is only points. I checked which elements "stat-container" contains, and it gave me an error. I believe that if someone helps me with that, the problem will be fixed. This is the error I get when I try to look at the elements inside "stat-container": line 27, in for element in stat_container: TypeError: 'WebElement' object is not iterable
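
For reference, that TypeError comes from looping over a single element. In Selenium, `find_element` (and a wait on `EC.visibility_of_element_located`) returns one `WebElement`, which cannot be iterated, while `find_elements` returns a list. The sketch below reproduces the difference without a browser; `FakeElement` is a hypothetical stand-in written for this illustration and is not part of Selenium.

```python
class FakeElement:
    """Hypothetical stand-in for a Selenium WebElement (illustration only)."""

    def __init__(self, text, children=()):
        self.text = text
        self._children = list(children)

    def find_elements(self, by, value):
        # Mirrors the shape of WebElement.find_elements: always returns a list.
        # The locator arguments are ignored in this stand-in.
        return list(self._children)


tabs = [FakeElement("Points"), FakeElement("Rebounds"), FakeElement("Assists")]
stat_container = FakeElement("Points\nRebounds\nAssists", children=tabs)

# Looping over the single container object fails, as in the question.
try:
    for element in stat_container:
        pass
    error_message = None
except TypeError as exc:
    error_message = str(exc)

# Looping over the list returned by find_elements works.
labels = [tab.text for tab in stat_container.find_elements("css selector", "div.stat")]

print(error_message)
print(labels)
```

With a real driver the same rule applies: iterate over what `find_elements` returns, not over the container itself.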

from selenium import webdriver
from selenium.webdriver.common.keys import Keys
import time
import pandas as pd
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC

PATH = r"C:\Program Files (x86)\chromedriver.exe"  # raw string keeps the backslashes literal
driver = webdriver.Chrome(PATH)
driver.get("https://app.prizepicks.com/")


driver.find_element(By.CLASS_NAME, "close").click()


time.sleep(2)

driver.find_element(By.XPATH, "//div[@class='name'][normalize-space()='NBA']").click()

time.sleep(2)

# Wait for the stat-container element to be present and visible
stat_container = WebDriverWait(driver, 10).until(EC.visibility_of_element_located((By.CLASS_NAME, "stat-container")))

# Find all stat elements within the stat-container
stat_elements = driver.find_elements(By.CSS_SELECTOR, "div.stat")

# Initialize empty list to store data
nbaPlayers = []

# Iterate over each stat element
for stat in stat_elements:
    # Click the stat element
    stat.click()

    projections = WebDriverWait(driver, 20).until(EC.presence_of_all_elements_located((By.CSS_SELECTOR, ".projection")))

    for projection in projections:

        names = projection.find_element(By.XPATH, './/div[@class="name"]').text
        points= projection.find_element(By.XPATH, './/div[@class="presale-score"]').get_attribute('innerHTML')
        text = projection.find_element(By.XPATH, './/div[@class="text"]').text
        print(names, points, text)

        players = {
            'Name': names,
            'Prop':points, 'Line':text
            }

        nbaPlayers.append(players)
   

df = pd.DataFrame(nbaPlayers)
print(df)

driver.quit()
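
The fixed `time.sleep(2)` pauses in the code above could be swapped for explicit waits, so each click happens as soon as the page is ready rather than after a guessed delay. This is a minimal sketch, assuming the same locators as above (a close button with class "close" and a league tab in a div with class "name"); `open_league` is a hypothetical helper name, and whether the popup actually behaves this way on the live site is an assumption.

```python
def open_league(driver, league="NBA", timeout=10):
    """Close the intro popup, then click a league tab, without fixed sleeps.

    Assumes the locators used in the question: a button with class "close"
    and a div with class "name" holding the league label.
    """
    # Selenium imports are kept inside the function so this file can be
    # imported on a machine where Selenium is not installed.
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support import expected_conditions as EC
    from selenium.webdriver.support.ui import WebDriverWait

    wait = WebDriverWait(driver, timeout)
    close_button = (By.CLASS_NAME, "close")
    league_tab = (By.XPATH, f"//div[@class='name'][normalize-space()='{league}']")

    # Wait until the popup's close button can be clicked, then click it.
    wait.until(EC.element_to_be_clickable(close_button)).click()
    # Wait for the popup to disappear before touching anything behind it.
    wait.until(EC.invisibility_of_element_located(close_button))
    # Only now click the league tab, once it is clickable.
    wait.until(EC.element_to_be_clickable(league_tab)).click()
```

Calling `open_league(driver)` right after `driver.get(...)` would stand in for the two `click()` calls and both sleeps.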
         
joey

1 Answer


The answer has been updated with the working code. I made some small changes:

  • Replaced stat_elements with categories, which is a list containing the stat names.

  • Loop over categories and click the div button with label equal to the current category name

  • Add .replace('\n','') at the end of the text variable


driver.get("https://app.prizepicks.com/")

driver.find_element(By.CLASS_NAME, "close").click()
time.sleep(2)
driver.find_element(By.XPATH, "//div[@class='name'][normalize-space()='NBA']").click()
time.sleep(2)

# Wait for the stat-container element to be present and visible
stat_container = WebDriverWait(driver, 10).until(EC.visibility_of_element_located((By.CLASS_NAME, "stat-container")))

# Read the tab labels from the stat-container,
# i.e. categories is the list ['Points','Rebounds',...,'Turnovers']
categories = driver.find_element(By.CSS_SELECTOR, ".stat-container").text.split('\n')

# Initialize empty list to store data
nbaPlayers = []

# Iterate over each category label
for category in categories:
    # Print a small header for the current category
    line = '-'*len(category)
    print(line + '\n' + category + '\n' + line)
    # Re-locate the tab by its label and click it
    driver.find_element(By.XPATH, f"//div[text()='{category}']").click()

    projections = WebDriverWait(driver, 20).until(EC.presence_of_all_elements_located((By.CSS_SELECTOR, ".projection")))

    for projection in projections:

        names = projection.find_element(By.XPATH, './/div[@class="name"]').text
        points= projection.find_element(By.XPATH, './/div[@class="presale-score"]').get_attribute('innerHTML')
        text = projection.find_element(By.XPATH, './/div[@class="text"]').text.replace('\n','')
        print(names, points, text)

        players = {'Name': names, 'Prop':points, 'Line':text}

        nbaPlayers.append(players)

df = pd.DataFrame(nbaPlayers)
print(df)
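
Collecting the tab labels as plain strings and re-locating each tab right before clicking it is what sidesteps the `StaleElementReferenceException` from the comments: no `WebElement` reference is held across clicks, so a re-render of the tabs cannot invalidate anything. The sketch below isolates the two string steps. `split_categories`, `xpath_literal` and `xpath_for_label` are hypothetical helper names, and the quote handling is an addition that is not in the code above, for labels that might contain an apostrophe.

```python
def split_categories(container_text):
    """Turn the stat container's text into a clean list of tab labels."""
    return [line.strip() for line in container_text.split("\n") if line.strip()]


def xpath_literal(value):
    """Quote a string for use inside an XPath 1.0 expression."""
    if "'" not in value:
        return f"'{value}'"
    if '"' not in value:
        return f'"{value}"'
    # Both quote types present: stitch the pieces together with concat().
    parts = value.split("'")
    return "concat(" + ", \"'\", ".join(f"'{part}'" for part in parts) + ")"


def xpath_for_label(label):
    """XPath for a div whose text is exactly the given tab label."""
    return f"//div[text()={xpath_literal(label)}]"


# Example input shaped like the text of the stat container.
categories = split_categories("Points\nRebounds\n\nPts+Rebs\n")
print(categories)
for category in categories:
    print(xpath_for_label(category))
```

In the loop, `driver.find_element(By.XPATH, xpath_for_label(category)).click()` would take the place of the f-string locator.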
sound wave
  • It should work, but it doesn't give me any results. I keep getting this error: " driver.find_element_by_class_name("close").click() ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ AttributeError: 'WebDriver' object has no attribute 'find_element_by_class_name'" – joey Jan 18 '23 at 15:26
  • @JesusVargas Replace that command with `driver.find_element(By.CLASS_NAME, "close")` and try again – sound wave Jan 18 '23 at 15:52
  • @JesusVargas Replace all occurrences of `find_element_by_xpath("...")` with `find_element(By.XPATH, "...")` – sound wave Jan 18 '23 at 16:13
  • @JesusVargas Add `time.sleep(5)` after `driver.find_element(By.CLASS_NAME, "close").click()`. I think the problem is that after it closes the "How to play" popup, it immediately tries to click on "NBA" but the popup is not closed yet – sound wave Jan 18 '23 at 16:31
  • Alright, that worked, but there's another problem! I made a small addition to print "Button clicked" whenever a click succeeds, and it works for the popup and for NBA. I added some extra loading time before clicking the "selector-button", but it doesn't work, so it seems like that's an issue as well. – joey Jan 18 '23 at 16:39
  • @JesusVargas If you open the DevTools on that page (right click -> Inspect), press CTRL+F and paste `segment-selector-button`, you will see 0 results, which means there are no elements with that class. What are you trying to click? – sound wave Jan 18 '23 at 16:50
  • hey @Soundwave! I have updated the code, can you help me with that final fix? I would greatly appreciate it. – joey Jan 18 '23 at 17:56
  • @JesusVargas Replace `stat_elements = stat_container.find_elements(By.XPATH,"//div[@class='stat' or @class='stat stat-active']")` with this one `stat_elements = driver.find_elements(By.CSS_SELECTOR, "div.stat")` – sound wave Jan 18 '23 at 18:11
  • It started to print, and then when it finished the "Points" it gave an error. Maybe the error is in the stat-container? – joey Jan 18 '23 at 18:16
  • @JesusVargas Put the error in the question – sound wave Jan 18 '23 at 18:17
  • It doesn't let me, so I'll attach it here: – joey Jan 18 '23 at 18:21
  • File "/Users/jesusvargas/Desktop/VSCODE/Python/SCRAPE/selenium_test", line 35, in stat.click() File "/Users/jesusvargas/Desktop/VSCODE/env/lib/python3.11/site-packages/selenium/webdriver/remote/webelement.py", line 93, in click self._execute(Command.CLICK_ELEMENT) File "/Users/jesusvargas/Desktop/VSCODE/env/lib/python3.11/site-packages/selenium/webdriver/remote/webelement.py", line 410, in _execute return self._parent.execute(command, params) ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ – joey Jan 18 '23 at 18:21
  • File "/Users/jesusvargas/Desktop/VSCODE/env/lib/python3.11/site-packages/selenium/webdriver/remote/webdriver.py", line 444, in execute self.error_handler.check_response(response) File "/Users/jesusvargas/Desktop/VSCODE/env/lib/python3.11/site-packages/selenium/webdriver/remote/errorhandler.py", line 249, in check_response raise exception_class(message, screen, stacktrace) selenium.common.exceptions.StaleElementReferenceException: Message: stale element reference: element is not attached to the page document (Session info: chrome=109.0.5414.87) Stacktrace: – joey Jan 18 '23 at 18:21
  • Stacktrace: 0 chromedriver 0x0000000100ebafa8 chromedriver + 4886440 1 chromedriver 0x0000000100e38643 chromedriver + 4351555 2 chromedriver 0x0000000100a86b27 chromedriver + 477991...... etc..... – joey Jan 18 '23 at 18:21
  • In the question there is still the old `stat_elements`, replace it with the one I told you in the previous comment `stat_elements = driver.find_elements(By.CSS_SELECTOR, "div.stat")` – sound wave Jan 18 '23 at 18:24
  • Done! What else do i fix for the error? I still think the error lies in the stat-container. – joey Jan 18 '23 at 18:31
  • @JesusVargas I updated the answer with working code :) – sound wave Jan 18 '23 at 21:33
  • WOW!!!! IT WORKS PERFECTLY!! Thank you so much sound wave this helps me out incredibly. I was still trying to figure it out and your code does it wonderfully!! – joey Jan 18 '23 at 22:03
  • Hey @soundwave, if I want to make it headless (I don't want the browser to show up) and only print the DataFrame, how can I do it? I looked up a video on how to make it headless, but it's not working. – joey Jan 18 '23 at 22:25
  • @JesusVargas To run the browser in headless mode you have to do `options = webdriver.ChromeOptions(); options.headless = True; driver = webdriver.Chrome(PATH, options=options)`. However, I tried this, and when loading the URL I get a blank page, i.e. if I print the HTML with `print(driver.page_source)` it returns `` – sound wave Jan 19 '23 at 08:28
  • @joey If the problem is solved consider marking the answer as accepted by clicking ✓ on the left – sound wave Apr 18 '23 at 20:27
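
The question also asks for the board to be re-scraped every few minutes, which the thread does not get to. One possible shape for that is a small loop around a scrape function. This is only a sketch: `run_periodically` and `scrape_once` are hypothetical names, `scrape_once` stands for the category loop from the answer wrapped in a function that returns the list of rows, and the row in `fake_scrape` is made-up placeholder data.

```python
import time


def run_periodically(scrape_once, interval_seconds, max_runs=None, sleep=time.sleep):
    """Call scrape_once repeatedly, pausing interval_seconds between calls.

    Returns the results of all successful runs. With max_runs=None the loop
    runs until interrupted; the sleep parameter exists so a test can pass a
    no-op instead of really waiting.
    """
    results = []
    runs = 0
    while max_runs is None or runs < max_runs:
        try:
            results.append(scrape_once())
        except Exception as exc:  # keep the loop alive if one scrape fails
            print(f"scrape failed: {exc!r}")
        runs += 1
        # Pause only if another run is coming.
        if max_runs is None or runs < max_runs:
            sleep(interval_seconds)
    return results


# Example with a stand-in scrape function (no browser involved).
def fake_scrape():
    return [{"Name": "Player A", "Prop": "34.5", "Line": "Points"}]


snapshots = run_periodically(fake_scrape, interval_seconds=0, max_runs=2)
print(len(snapshots))
```

For a five-minute refresh, `interval_seconds=300` would be passed together with the real scrape function.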