
I would like to call a function that returns a data.frame from within a dplyr pipeline, in order to read data from Excel files whose locations I want to configure easily. Here I present the problem using a simplified get_table() function and two generated data.frames. In reality, get_table() fetches data from a server and parses it.

When the function is called from dplyr, all the resulting data.frames should be combined into one. Here is the simplified code:

files <- read.table(header=T, text='
  type      filename
  A         A_table
  B         B_table
')

A_table <- read.table(header=T, text='
  area      observations
  K1        5
  K2        9
')

B_table <- read.table(header=T, text='
  area      observations
  K1        23
  K2        28
  K3        1
')

get_table <- function(name) {
  return(get(name))
}

I can read the files with lapply:

list <- as.vector(files[,2])
t <- lapply(list, get_table)
do.call("rbind", t)

And combine the results into:

  area observations
1   K1            5
2   K2            9
3   K1           23
4   K2           28
5   K3            1

However, I would like to learn how to do the same in dplyr style, with something like the following (which does not work as written):

files %>%
  select(filename) %>%
    rowwise() %>% 
      get_table()
pe3
  • You can just do `mget(as.character(files$filename))` – akrun Dec 15 '14 at 02:42
  • `filename` is a factor, so it would be easier if you converted to character first – Rich Scriven Dec 15 '14 at 02:53
  • `t %>% rbind_all()` if you insist on using dplyr – Khashaa Dec 15 '14 at 03:43
  • The `get_table` function is pointless - it doesn't do anything but run `get`. – thelatemail Dec 15 '14 at 04:11
  • I was unclear. The get_table function is in reality more complex. It fetches Excel data from a server and then parses it, but the end result is a data.frame. So my problem is not how to get rid of the get_table function but how to use it as part of a dplyr sequence. – pe3 Dec 15 '14 at 06:36

1 Answer


As noted by @Richard Scriven, filename should be a character column rather than a factor, so read the table with stringsAsFactors=FALSE:

files <- read.table(header=T, stringsAsFactors=FALSE, text='
  type      filename
  A         A_table
  B         B_table
')
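To see why the column type matters, a quick check along the following lines may help. It is only a sketch: `files_fct` and `files_chr` are hypothetical names, and `stringsAsFactors = TRUE` is passed explicitly because R 4.0.0 and later default to `FALSE`.

```r
# Force the factor behaviour explicitly (it was the default before R 4.0.0).
files_fct <- read.table(header = TRUE, stringsAsFactors = TRUE, text = '
  type      filename
  A         A_table
  B         B_table
')

class(files_fct$filename)   # "factor"

# Convert to character before handing the values to get() or get_table().
files_chr <- files_fct
files_chr$filename <- as.character(files_chr$filename)

class(files_chr$filename)   # "character"
```

If the table has already been read with factors, converting the column this way should be enough; re-reading the data is not required.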

Wrapping the get_table() call in do() on the last line of your pipeline achieves the same result as lapply(files[ ,2], get) %>% rbind_all:

files %>%
  rowwise() %>% 
  do(get_table(.$filename))

#Groups: <by row>

#  area observations
#1   K1            5
#2   K2            9
#3   K1           23
#4   K2           28
#5   K3            1
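In later dplyr releases `do()` is marked as superseded and `rbind_all()` has been replaced by `bind_rows()`, so the following sketch may be a more durable way to get the same result. It assumes `get_table()` returns a data.frame for each name; the `tables` and `combined` names and the extra `type` column are illustrative choices, not part of the original question.

```r
library(dplyr)

files <- read.table(header = TRUE, stringsAsFactors = FALSE, text = '
  type      filename
  A         A_table
  B         B_table
')

# Stand-ins for the two tables from the question.
A_table <- data.frame(area = c("K1", "K2"),
                      observations = c(5, 9),
                      stringsAsFactors = FALSE)
B_table <- data.frame(area = c("K1", "K2", "K3"),
                      observations = c(23, 28, 1),
                      stringsAsFactors = FALSE)

# Simplified placeholder for the real fetch-and-parse function.
get_table <- function(name) get(name)

# Fetch each table, name the list elements by type, then stack them.
# .id = "type" records which list element each row came from.
tables <- lapply(files$filename, get_table)
names(tables) <- files$type
combined <- bind_rows(tables, .id = "type")

combined
```

Naming the list before calling `bind_rows()` keeps track of which file each row came from, which the `do()` version drops.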
Khashaa
  • The automatic creation of factors has been identified as one of the biggest gotchas in R. It played a role in this problem: http://stackoverflow.com/a/1535373/1792999 – pe3 Dec 15 '14 at 20:08