
I am trying to scrape data from the following website:

http://stats.nba.com/game/0041700404/playbyplay/

I'd like to create a table that includes the date of the game, the scores throughout the game, and the team names.

I am using the following code:

library(rvest)

game1 <- read_html("http://stats.nba.com/game/0041700404/playbyplay/")

#Extracts the Date
html_nodes(game1, xpath = '//*[contains(concat( " ", @class, " " ), concat( " ", "game-summary-team--vtm", " " ))]//*[contains(concat( " ", @class, " " ), concat( " ", "game-summary-team__lineup", " " ))]')

#Extracts the Score
html_nodes(game1, xpath = '//*[contains(concat( " ", @class, " " ), concat( " ", "status", " " ))]//*[contains(concat( " ", @class, " " ), concat( " ", "score", " " ))]')

#Extracts the Team names
html_nodes(game1, xpath = '//*[contains(concat( " ", @class, " " ), concat( " ", "game-summary-team__name", " " ))]//a')

Unfortunately, I get the following:

{xml_nodeset (0)}
{xml_nodeset (0)}
{xml_nodeset (0)}

I have seen a number of questions and answers about this problem, but none of them seem to help.

wp78de
user8304241
2 Answers


Unfortunately, rvest does not work well with pages whose content is generated dynamically by JavaScript. It only sees the HTML that the server returns, without running any scripts, so it works best with static HTML pages.
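To illustrate why the selectors come back empty, here is a minimal sketch that needs no network access. The HTML string is a hypothetical stand-in for a JavaScript-driven page (it is not the actual NBA markup): the span only exists after a browser runs the script, so rvest should not find it.

```r
library(rvest)

# Hypothetical page: the <span> is created only when a browser executes the script.
page <- read_html('<html><body>
  <div id="app"></div>
  <script>
    var s = document.createElement("span");
    s.className = "team-full";
    s.textContent = "Team";
    document.getElementById("app").appendChild(s);
  </script>
</body></html>')

# rvest parses the raw HTML and never runs the script,
# so the node set is expected to be empty.
found <- html_nodes(page, xpath = "//span[@class='team-full']")
length(found)
```

The same thing happens with the real page: the elements the XPath expressions target are not in the HTML that read_html() downloads.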

I would suggest taking a look at RSelenium, which drives a real browser. I was able to get data out of the page using rsDriver:

Code Sample:

library(RSelenium)
rD <- rsDriver() # runs a chrome browser, wait for necessary files to download
remDr <- rD$client
#no need for remDr$open() browser should already be open
remDr$navigate("http://stats.nba.com/game/0041700404/playbyplay/")

teams <- remDr$findElement(using = "xpath", "//span[@class='team-full']")
teams$getElementText()[[1]]
# and so on...

remDr$close()
# stop the selenium server
rD[["server"]]$stop() 
# Even if you forget to stop the server, it is garbage collected
# once the rsDriver object is removed:
rm(rD)
gc()

and so on...
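Once the browser has rendered the page, one option is to hand the rendered source to rvest and extract several elements at once. The sketch below is only an illustration: the HTML string is a hypothetical stand-in so that it runs without a browser, and the placeholder team names are made up. With a live session the string would presumably come from remDr$getPageSource()[[1]].

```r
library(rvest)

# With a live RSelenium session the rendered markup would come from the browser:
# rendered <- remDr$getPageSource()[[1]]
# Hypothetical stand-in so the sketch runs on its own:
rendered <- '<html><body>
  <span class="team-full">Team A</span>
  <span class="team-full">Team B</span>
</body></html>'

page <- read_html(rendered)

# html_nodes() returns every match, so both team names are captured.
team_nodes <- html_nodes(page, xpath = "//span[@class='team-full']")
teams <- html_text(team_nodes)
teams
```

The other fields (date, scores) could be pulled the same way and combined with data.frame(), once the right selectors for the rendered page are known.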

PS: I had some trouble installing it on Windows with the current version of R. What worked for me is described in "How to set up rselenium for R?"

wp78de

I had success with the splashr package in R. You need Docker to use it. Installation instructions are available at the links below:

https://cran.r-project.org/web/packages/splashr/vignettes/intro_to_splashr.html

https://docs.docker.com/docker-for-mac/install/#install-and-run-docker-for-mac - how to install and run Docker on a Mac

https://splash.readthedocs.io/en/stable/install.html - run these commands in a terminal window before using splashr

user8304241
  • Welcome to Stack Overflow! While links are a great way of sharing knowledge, they won't really answer the question if they get broken in the future. Add to your answer the essential content of the link which answers the question. In case the content is too complex or too big to fit here, describe the general idea of the proposed solution. Remember to always keep a link reference to the original solution's website. See: [How do I write a good answer?](https://stackoverflow.com/help/how-to-answer) – sɐunıɔןɐqɐp Sep 21 '18 at 08:10