R - Extracting Tables From Websites Using XML Package

Question

I am trying to replicate the method from a previous answer, "Scraping html tables into R data frames using the XML package", for my own work, but I cannot get the data to extract. The website I am using is: http://www.footballfanalytics.com/articles/football/euro_super_league_table.html

I just wish to extract a table of each team name and their current rating score. My code is as follows:

library(XML)
theurl <-  "http://www.footballfanalytics.com/articles/football/euro_super_league_table.html"
tables <- readHTMLTable(theurl)
n.rows <- unlist(lapply(tables, function(t) dim(t)[1]))
tables[[which.max(n.rows)]]

This produces the error message:

Error in tables[[which.max(n.rows)]] : 
attempt to select less than one element

Could anyone suggest a solution please? Is there something in this particular site causing this not to work? Or is there a better alternative method I can try? Thanks

score 1 · Accepted Answer · answered Apr 28 '14 at 09:25

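It seems the data is loaded via JavaScript, so the table is not in the HTML that readHTMLTable() receives. The page appears to pull its data from an XML file, which you can parse directly. Try: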

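library(XML)
# XML feed that appears to supply the table data
theurl <- "http://www.footballfanalytics.com/xml/esl/esl.xml"
doc <- xmlParse(theurl)
# Pull each team's name and points, then bind them into a two-column character matrix
cbind(team = xpathSApply(doc, "/StatsData/Teams/Team/Name", xmlValue),
      points = xpathSApply(doc, "/StatsData/Teams/Team/Points", xmlValue))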
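
If you would rather have a data frame with numeric ratings, something along these lines may help. This is only a sketch: the inline XML is a made-up sample that mirrors the structure implied by the XPath expressions above (the real feed may contain more fields), and the team names and values are hypothetical. The first few lines also illustrate what is probably behind the original error: when no tables are found, the row counts are empty and which.max() has nothing to select.

```r
library(XML)

# Likely cause of the original error: an empty list of tables gives
# no row counts, so which.max() returns a zero-length index.
no_tables <- list()
n.rows <- unlist(lapply(no_tables, nrow))  # NULL
idx <- which.max(n.rows)                   # integer(0)

# Hypothetical sample mirroring /StatsData/Teams/Team/{Name,Points};
# this is an assumption about the feed, not a copy of it.
sample_xml <- "<StatsData><Teams>
  <Team><Name>Team B</Name><Points>1790.25</Points></Team>
  <Team><Name>Team A</Name><Points>1850.5</Points></Team>
</Teams></StatsData>"

doc <- xmlParse(sample_xml, asText = TRUE)

# Build a data frame and convert the points from text to numeric
ratings <- data.frame(
  team   = xpathSApply(doc, "/StatsData/Teams/Team/Name", xmlValue),
  points = as.numeric(xpathSApply(doc, "/StatsData/Teams/Team/Points", xmlValue)),
  stringsAsFactors = FALSE
)

# Sort from highest to lowest rating
ratings <- ratings[order(-ratings$points), ]
```

To run it against the live feed, replace the xmlParse() call with xmlParse(theurl) as in the code above. Checking length(tables) > 0 before indexing is a cheap way to catch the empty-result case early.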
