I have a list with 3 elements, each holding a different set and number of values. I would like to turn this list into a simple two-column data frame.

One column would be the value from the list element, the second column would be the name of the list element itself.

myList <- list(A = c(1,2,3),
               B = c(10,20,30,40),
               C = c(100,200,300,400,500))

So the ideal outcome is something like:

Value     List
1         A
2         A
10        B
100       C
......

So I know I can do this with a series of rbinds:

library(magrittr)  # provides %>%

df <- data.frame(Value = myList[["A"]], cluster = "A") %>%
  rbind(data.frame(Value = myList[["B"]], cluster = "B")) %>%
  rbind(data.frame(Value = myList[["C"]], cluster = "C"))

And I can probably clean this up with a loop or lapply...but it seems like there should be a more straightforward way to get this!
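
A sketch of that lapply variant, assuming every list element is named (the helper names `pieces` and `nm` are only illustrative):

```r
myList <- list(A = c(1, 2, 3),
               B = c(10, 20, 30, 40),
               C = c(100, 200, 300, 400, 500))

# Build one small data frame per list element, tagging each row
# with the name of the element it came from.
pieces <- lapply(names(myList), function(nm) {
  data.frame(Value = myList[[nm]], List = nm)
})

# Bind all the pieces together in a single call instead of chaining rbind().
df <- do.call(rbind, pieces)
```

This works, but it still spells out by hand what feels like a single reshaping step.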

Any help would be greatly appreciated.

Max F

3 Answers


We can use stack() from base R (it lives in the utils package, which is attached by default):

stack(myList)

-output

   values ind
1       1   A
2       2   A
3       3   A
4      10   B
5      20   B
6      30   B
7      40   B
8     100   C
9     200   C
10    300   C
11    400   C
12    500   C
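
If the column names from the question are wanted, the result can be relabelled afterwards. A minimal sketch, assuming the target names are Value and List as in the desired output above:

```r
myList <- list(A = c(1, 2, 3),
               B = c(10, 20, 30, 40),
               C = c(100, 200, 300, 400, 500))

df <- stack(myList)

# stack() calls its columns "values" and "ind"; rename them to match the question.
names(df) <- c("Value", "List")

# "ind" comes back as a factor; convert it if a character column is preferred.
df$List <- as.character(df$List)
```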
akrun
  • oh I didn't even know about this function! This is perfect, thank you. – Max F Aug 20 '21 at 14:42
  • When I try this, I get an error: Error in data.frame(values = unlist(unname(x)), ind, stringsAsFactors = FALSE) : arguments imply differing number of rows: 41, 0 – Max F Aug 20 '21 at 15:42
  • @MaxF No idea about the issue. This is based on your example and it works fine for me – akrun Aug 20 '21 at 16:28
  • @MaxF Also, please note that `stack` function can be found in other packages as well. So, you may have to check whether `stack` from other packages masked the base R stack function – akrun Aug 20 '21 at 17:14
  • I think my problem is that the actual list I'm using in my actual code does not have named elements upon generation. Once I name the elements, it's fine. I think I need to go back and figure out how to automatically name the list elements (it's fine if the names are just "1", "2"...) – Max F Aug 20 '21 at 20:42
  • @MaxF It is easier with `stack(setNames(myList, seq_along(myList)))` – akrun Aug 20 '21 at 20:43
  • oh, that's cleaner. I did it with a for loop. Thank you!! – Max F Aug 20 '21 at 20:52
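
To illustrate the point from the comments: stack() builds the ind column from the element names, so an unnamed list gives it nothing to work with, which is consistent with the error reported above. A small sketch of the setNames() workaround, using a made-up unnamed list (`unnamed` is not from the original post):

```r
# Hypothetical input: same kind of data as myList, but without names.
unnamed <- list(c(1, 2, 3), c(10, 20, 30, 40))

# Name each element after its position: "1", "2", ...
named <- setNames(unnamed, seq_along(unnamed))

# With names in place, stack() can fill in the ind column.
df <- stack(named)
```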

If you want to use tidyverse (not sure it can be done just with dplyr), you can use

library(magrittr)
tibble::enframe(myList) %>% tidyr::unnest(cols = value)

output

# A tibble: 12 x 2
   name  value
   <chr> <dbl>
 1 A         1
 2 A         2
 3 A         3
 4 B        10
 5 B        20
 6 B        30
 7 B        40
 8 C       100
 9 C       200
10 C       300
11 C       400
12 C       500

First, tibble::enframe(myList) returns a tibble with two columns and three rows. The name column holds the name of each element of your original list, and value is a list-column whose entries are the vectors of values from each list element.

Then, tidyr::unnest(cols = value) just unnests the value column.
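
The two steps can be inspected separately. This is only a sketch, assuming the tibble and tidyr packages are installed; the names `nested` and `flat` are illustrative:

```r
myList <- list(A = c(1, 2, 3),
               B = c(10, 20, 30, 40),
               C = c(100, 200, 300, 400, 500))

# Step 1: one row per list element; `value` is a list-column of numeric vectors.
nested <- tibble::enframe(myList)

# Step 2: expand the list-column so that every value gets its own row.
flat <- tidyr::unnest(nested, cols = value)
```

Printing nested before unnesting makes it easy to see that each cell of value still holds a whole vector.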


That said, I do encourage you to consider @akrun's answer as utils::stack(myList) is considerably faster, and less verbose.

(edited to add @Martin Gal's approach using purrr)

microbenchmark::microbenchmark(
   tidyverse = tibble::enframe(myList) %>% tidyr::unnest(cols = value),
   baseR = utils::stack(myList),
   purrr = purrr::map_df(myList, ~data.frame(value = .x), .id = "id"),
   times = 10000
)

output

Unit: microseconds
     expr      min       lq      mean    median        uq       max neval
 tidyverse 1937.067 2169.251 2600.4402 2301.1385 2592.7305 77715.238 10000
     baseR  144.218  182.112  227.6124  202.0755  230.0960  5476.169 10000
     purrr  350.265  417.803  523.7954  455.4410  520.3555 71673.820 10000
Daniel

One option using purrr:

library(purrr)

map_df(myList, ~data.frame(value = .x), .id = "id")

returns

   id value
1   A     1
2   A     2
3   A     3
4   B    10
5   B    20
6   B    30
7   B    40
8   C   100
9   C   200
10  C   300
11  C   400
12  C   500
Martin Gal