I am a beginner in R Programming.

I would like to scrape football data from Squawka and put it in a data frame for analysis (football analytics is a new hobby of mine), specifically from pages like this one: http://eredivisie.squawka.com/willem-ii-vs-psv/10-08-2014/dutch-eredivisie/matches.

On Stack Overflow I found a thread that explains how to do this: "how to scrape this squawka page?".

Unfortunately, when I implement the code (see below) that is given in the above-mentioned thread for processing XML attributes/data into a data frame, I receive the following error message:

"Error in (function (..., deparse.level = 1, make.row.names = TRUE, stringsAsFactors = default.stringsAsFactors()) : numbers of columns of arguments do not match"

data <- lapply(example, function(x) {
  if (length(x['event']) > 0) {
    res <- lapply(x['event'], function(y) {
      matchAttrs <- as.list(xmlAttrs(y))
      matchAttrs$start <- xmlValue(y['start']$start)
      matchAttrs$end <- xmlValue(y['end']$end)
      matchAttrs
    })
    return(do.call(rbind.data.frame, res))
  }
})

The result should look something like this:

        player_id mins secs minsec team   type     start       end
event         531    4   39    279   44 Failed 73.1,87.1 97.9,49.1
event5        311    6   33    393   31 Failed 92.3,13.1 93.0,31.0
event1        376    8   57    537   31 Failed  97.7,6.1 96.7,16.4
event6        311   13   50    830   31 Failed  99.5,0.5 94.9,42.6
event11       311   14   11    851   31 Failed  99.5,0.5 93.1,51.0
event7        311   17   41   1061   31 Failed 99.5,99.5 92.6,50.1

I have tried several other solutions from Stack Overflow that deal with similar situations, but so far I have not managed to get any of them to work.

  • The elements probably don't all have the same number of attributes. You should decide which ones you want, and provide missing values when they're not given. – Nathan Werth Sep 01 '17 at 14:19
  • Thanks @NathanWerth. I have selected all the attributes in the element 'event': sapply(c("player_id","mins", "secs","minsec","team","injurytime_play","type","k", ), function(x) xpathSApply(example, '//event', xmlGetAttr, x)) With this code I receive the following error message: Error during wrapup: no applicable method for 'xpathApply' applied to an object of class "XMLNodeSet". – YasinTuncbilek Sep 03 '17 at 16:58
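
A minimal sketch of what the first comment suggests, in base R only so that it runs without the XML package. The `events` list is made-up data standing in for what `as.list(xmlAttrs(y))` might return for three event nodes, one of which lacks the `type` attribute. `wanted` and `pad_event` are hypothetical names introduced for illustration. It assumes that uneven attribute counts are indeed what triggers the "numbers of columns of arguments do not match" error, which the comment states only as probable.

```r
# Made-up attribute lists standing in for as.list(xmlAttrs(y)) results.
# The second event has no "type" attribute, so the rows are uneven.
events <- list(
  event  = list(player_id = "531", mins = "4", secs = "39", type = "Failed"),
  event5 = list(player_id = "311", mins = "6", secs = "33"),
  event1 = list(player_id = "376", mins = "8", secs = "57", type = "Failed")
)

# Decide up front which columns every row must have.
wanted <- c("player_id", "mins", "secs", "type")

# Build a one-row data frame with exactly the wanted columns,
# using NA wherever an event does not carry that attribute.
pad_event <- function(ev, cols) {
  vals <- vapply(cols, function(k) {
    if (is.null(ev[[k]])) NA_character_ else as.character(ev[[k]])
  }, character(1))
  as.data.frame(as.list(vals), stringsAsFactors = FALSE)
}

# Every row now has the same columns, so rbind can stack them.
rows <- lapply(events, pad_event, cols = wanted)
result <- do.call(rbind, rows)
print(result)
```

With real data, the same padding step could be applied to each attribute list before the `do.call(rbind.data.frame, res)` call. All columns come back as character, so numeric fields such as `mins` would still need converting, for example with `as.numeric`.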
