I am trying to take the names of list elements, use do() to apply a function to each of them, and then bind the results into a single data frame.

require(XML)
require(magrittr)
require(dplyr)  # provides group_by() and do(), used below

url <- "http://gd2.mlb.com/components/game/mlb/year_2016/month_05/day_21/gid_2016_05_21_milmlb_nynmlb_1/boxscore.xml"

box <- xmlParse(url)

xml_data <- xmlToList(box)

end <- length(xml_data[[2]]) - 1

x <- seq_len(end)

away_pitchers_names <- paste0("xml_data[[2]][", x, "]")
away_pitchers_names <- as.data.frame(away_pitchers_names)
names(away_pitchers_names) <- "elements"
away_pitchers_names$elements %<>% as.character()

listTodf <- function(x) {
  df <- as.data.frame(x)
  tdf <- as.data.frame(t(df))
  row.names(tdf) <- NULL
  tdf
}

test <- away_pitchers_names %>% group_by(elements) %>% do(listTodf(.$elements))

When I run the listTodf function on a list element it works fine:

listTodf(xml_data[[2]][1])

      id   name name_display_first_last pos out bf er r h so hr bb np  s w l sv bs hld s_ip s_h s_r s_er s_bb
1 605200 Davies             Zach Davies   P  16 22  4 4 5  5  2  2 86 51 1 3  0  0   0 36.0  41  24   23   15
  s_so game_score  era
1   25         45 5.75

But when I try to loop through the names of the elements with the do() function I get the following:

Warning message: In rbind_all(out[[1]]) : Unequal factor levels: coercing to character

And here is the output:

> test
Source: local data frame [5 x 2]
Groups: elements [5]

          elements               V1
             (chr)            (chr)
1 xml_data[[2]][1] xml_data[[2]][1]
2 xml_data[[2]][2] xml_data[[2]][2]
3 xml_data[[2]][3] xml_data[[2]][3]
4 xml_data[[2]][4] xml_data[[2]][4]
5 xml_data[[2]][5] xml_data[[2]][5]

I am sure it is something extremely simple, but I can't figure out where things are getting tripped up.

BillPetti
  • Could you clarify why you're transposing the grouping variable by the same grouping variable and then combining the whole thing as a data.frame? Please elaborate on what exactly you're trying to do with a [minimal reproducible example](http://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example) – shrgm May 22 '16 at 12:29
  • Each element of the list has the same variables, they just represent different cases. So I am trying to take each element, combine them together, and then display them in wide form (since transforming a list element to a data frame displays in long form). – BillPetti May 22 '16 at 12:43
  • If you are evaluating the strings, use `eval(parse(..` i.e. `lapply(away_pitchers_names$elements, function(x) listTodf(eval(parse(text=x))))` Also, note that in the `listTodf` function the `as.data.frame` is called without `stringsAsFactors=FALSE`, so all the character columns will be by default `factor` class which will result in the warning mentioned in your post. – akrun May 22 '16 at 12:55

1 Answer

To evaluate the strings as R code, `eval(parse(text = x))` can be used:

library(dplyr)
lapply(away_pitchers_names$elements,
      function(x) as.data.frame.list(eval(parse(text=x))[[1]], stringsAsFactors=FALSE)) %>%
                bind_rows()
#      id      name name_display_first_last pos out bf er r h so hr bb np  s w l
#1 605200    Davies             Zach Davies   P  16 22  4 4 5  5  2  2 86 51 1 3
#2 430641     Boyer            Blaine Boyer   P   2  4  0 0 2  0  0  0  8  7 1 0
#3 448614 Torres, C           Carlos Torres   P   3  4  0 0 0  1  0  2 21 11 0 1
#4 592804 Thornburg         Tyler Thornburg   P   3  3  0 0 0  1  0  0 14  8 2 1
#5 518468    Blazek          Michael Blazek   P   1  5  1 1 2  0  0  2 23 10 1 1
#  sv bs hld s_ip s_h s_r s_er s_bb s_so game_score  era loss     note
#1  0  0   0 36.0  41  24   23   15   25         45 5.75 <NA>     <NA>
#2  0  1   0 21.1  22   4    4    5    7         48 1.69 <NA>     <NA>
#3  0  0   2 22.1  22   9    9   14   21         52 3.63 <NA>     <NA>
#4  1  2   8 18.2  13   8    8    7   29         54 3.86 <NA>     <NA>
#5  0  1   8 21.1  23   6    6   14   18         41 2.53 true (L, 1-1)
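
As a minimal illustration of why the strings have to be parsed and evaluated, the sketch below uses a small made-up list (`pitchers` is a hypothetical stand-in for `xml_data[[2]]`, with invented fields) so it runs without fetching the XML. It only uses base R.

```r
# Toy stand-in for the parsed box score (hypothetical values, not real data).
pitchers <- list(
  list(id = "605200", name = "Davies"),
  list(id = "430641", name = "Boyer")
)

# A string that merely looks like code is still just a character vector,
# which is why do() in the question returned the strings themselves.
ref <- "pitchers[[1]]"
class(ref)

# parse() turns the text into an unevaluated expression; eval() runs it.
first <- eval(parse(text = ref))
first$name

# Indexing directly yields the same object without the string detour.
identical(first, pitchers[[1]])
```
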

However, it is easier and faster to skip the strings and index the list directly:

lapply(xml_data[[2]][1:5], function(x) 
       as.data.frame.list(x, stringsAsFactors=FALSE)) %>%
                  bind_rows()
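
The same pattern can be tried on a toy list without downloading anything. This is only a sketch: `pitchers` and its fields are made-up stand-ins for `xml_data[[2]]`, and it assumes dplyr is installed. One element carries an extra `note` field to show that `bind_rows()` fills columns missing from the other rows with `NA`.

```r
library(dplyr)

# Hypothetical stand-in for xml_data[[2]]: one named list per pitcher.
pitchers <- list(
  list(id = "605200", name = "Davies", era = "5.75"),
  list(id = "430641", name = "Boyer",  era = "1.69"),
  list(id = "518468", name = "Blazek", era = "2.53", note = "(L, 1-1)")
)

# Convert each element to a one-row data frame, keeping strings as character.
rows <- lapply(pitchers, function(p) as.data.frame(p, stringsAsFactors = FALSE))

# Stack the rows; columns absent from a row (here `note`) are filled with NA.
result <- bind_rows(rows)
print(result)
```

Because every row is built with `stringsAsFactors = FALSE`, the columns are already character when they are combined, so the "Unequal factor levels" warning from the question does not come up.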
akrun