unable to retrieve all tables while scraping a webpage with R

Asked Dec 15 '16 at 04:39

Active Dec 15 '16 at 04:59

Viewed 34 times

I want to scrape the "Team Per Game Stats" table on http://www.basketball-reference.com/leagues/NBA_1996.html. I've tried the following code:

library(RCurl)  # provides getURL
library(XML)    # provides readHTMLTable, htmlTreeParse, xpathApply
webpage <- getURL("http://www.basketball-reference.com/leagues/NBA_1996.html")
tables <- readHTMLTable(webpage)

I've also tried parsing the page myself:

webpage <- getURL("http://www.basketball-reference.com/leagues/NBA_1996.html")
webpage <- readLines(tc <- textConnection(webpage))
pagetree <- htmlTreeParse(webpage, useInternalNodes = TRUE)
xpathApply(pagetree, "//table", xmlValue)

Both approaches return only the two tables under "Division Standings", whereas the page should contain more than 10 tables.

Also, when I search for "//table[@id='team-stats-per_game']" in the browser's Inspect Element panel, it leads me right to the table I want, but R returns NULL when I look for the same table with xpathApply.

What am I missing here? Thanks in advance.

edited Dec 15 '16 at 04:59 by Hack-R

asked Dec 15 '16 at 04:39 by jc127

That was a typo; I did use the quotation marks in the actual code. @Hack-R – jc127 Dec 15 '16 at 04:55
Right, I mean in your question. I went ahead and did it for you. – Hack-R Dec 15 '16 at 04:57
That website has been asked about several times. If you don't get the page in such a manner that it can run its JavaScript, it will hide the tables in HTML comments. [You can still parse them, though.](http://stackoverflow.com/questions/40665907/web-scraping-data-table-with-r-rvest?noredirect=1&lq=1) – alistaire Dec 15 '16 at 05:09
That's what I'm looking for, thanks a lot! @alistaire – jc127 Dec 15 '16 at 05:53
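
The approach alistaire's comment points to (reading tables out of HTML comments) could be sketched roughly as below, using the same XML package as the question. This is a minimal sketch under stated assumptions, not the linked answer's code: the inline `page_text` string is a hypothetical stand-in for the downloaded page, its table contents are placeholders, and the id "divs_standings" is invented. Only the id "team-stats-per_game" comes from the question.

```r
library(XML)

# Hypothetical stand-in for the downloaded page: one visible table and one
# table wrapped in an HTML comment. Contents are placeholders, not real data.
page_text <- '<html><body>
<table id="divs_standings"><tr><th>Team</th><th>W</th></tr><tr><td>Team A</td><td>1</td></tr></table>
<div><!--
<table id="team-stats-per_game"><tr><th>Team</th><th>PTS</th></tr><tr><td>Team A</td><td>100</td></tr></table>
--></div>
</body></html>'

doc <- htmlParse(page_text, asText = TRUE)

# A direct lookup finds nothing: the table is comment text, not an element.
direct <- getNodeSet(doc, "//table[@id='team-stats-per_game']")

# Collect the text of every comment node and keep those that contain a table.
comment_text <- xpathSApply(doc, "//comment()", xmlValue)
hidden_html <- comment_text[grepl("<table", comment_text, fixed = TRUE)]

# Re-parse the commented-out markup as HTML, then read the table by its id.
hidden_doc <- htmlParse(paste(hidden_html, collapse = "\n"), asText = TRUE)
node <- getNodeSet(hidden_doc, "//table[@id='team-stats-per_game']")[[1]]
per_game <- readHTMLTable(node, stringsAsFactors = FALSE)
```

With a real page, `page_text` would be the string returned by getURL in the question. Looking the table up by id after re-parsing avoids depending on the order in which the commented-out tables appear.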
