I have a data.frame (read in with jsonlite::fromJSON) that contains a list column whose elements are data.frames (or NULL), for example:

# > str(dat)
# 'data.frame': 2 obs. of  3 variables:
#  $ id   : int  1 2
#  $ name : chr  "Julie" "Justin"
#  $ score:List of 2
#   ..$ : NULL
#   ..$ :'data.frame': 5 obs. of  2 variables:
#   .. ..$ rid : int  1 2 3 4 5
#   .. ..$ math: int  5 17 19 12 16

I want to convert the data into the following form:

# id   name score.rid score.math
#  1  Julie        NA         NA
#  2 Justin         1          5
#  2 Justin         2         17
#  2 Justin         3         19
#  2 Justin         4         12
#  2 Justin         5         16
velvetrock
  • Possible duplicate of [Combine (rbind) data frames and create column with name of original data frames](https://stackoverflow.com/questions/15162197/combine-rbind-data-frames-and-create-column-with-name-of-original-data-frames) – Tim Biegeleisen Jan 22 '18 at 02:39
  • Could you please show the `dput` of the example? Maybe with `tidyverse`, `dat %>% mutate(score = map(score, ~if(is.null(.x)) tibble(rid = rep(NA, 5), math = rep(NA, 5)) else as_tibble(.x))) %>% unnest` could work. It is not clear why the row for 'Julie' should be the same – akrun Jan 22 '18 at 02:50
  • @TimBiegeleisen That post is about rbinding two data.frames, but my question is about converting a data.frame that contains a list – velvetrock Jan 22 '18 at 02:52
  • @akrun that was my fault, I will update the question – velvetrock Jan 22 '18 at 02:54
  • Also, is it possible to have more than 2 nested datasets with different numbers of rows? In that case, if there is a `NULL` dataset, how many rows should it get? – akrun Jan 22 '18 at 02:55
  • with the updated post, you can remove the `rep` and instead use `tibble(rid = NA, math = NA)` i.e. `dat %>% mutate(score = map(score, ~if(is.null(.x)) tibble(rid = NA, math = NA) else as_tibble(.x))) %>% unnest` – akrun Jan 22 '18 at 02:57
  • Or to make it more dynamic `dat %>% mutate(score = map(score, ~ if(is.null(.x)) tibble(NA) else as_tibble(.x))) %>% unnest %>% select(-`NA`)` The `NA` within select is within backquotes – akrun Jan 22 '18 at 03:01
  • @akrun Which package does the function `map` belong to? – velvetrock Jan 22 '18 at 03:11
  • @velvetrock It belongs to `purrr` – akrun Jan 22 '18 at 05:51

1 Answer

For each element of score that is NULL, substitute a one-row tibble holding NA using map from purrr; convert the other elements to tibbles, then unnest the column and drop the placeholder `NA` column.

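library(dplyr)
library(purrr)
library(tidyr)
dat %>%
   # replace NULL entries with a one-row tibble whose only column is named `NA`
   mutate(score = map(score, ~ if(is.null(.x)) tibble(NA) else as_tibble(.x))) %>% 
   # name the list column explicitly; newer tidyr versions warn when it is omitted
   unnest(score) %>%
   # drop the placeholder column introduced by tibble(NA)
   select(-`NA`)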
#  id   name rid math
#1  1  Julie  NA   NA
#2  2 Justin   1    5
#3  2 Justin   2   17
#4  2 Justin   3   19
#5  2 Justin   4   12
#6  2 Justin   5   16

data

dat <- data.frame(id = 1:2, name = c("Julie", "Justin"),
      score = I(list(NULL, data.frame(rid = 1:5, 
      math = c(5, 17, 19, 12, 16)))), stringsAsFactors = FALSE)
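
For comparison, the same idea (swap each NULL for a one-row NA placeholder, then repeat the parent row once per nested row) can be sketched in base R without any packages. This is only an illustration under a few assumptions: the names `score_cols`, `na_row`, `scores`, `n_rows` and `flat` are made up here, the list column is attached with `dat$score <- list(...)` rather than `I(list(...))`, and every non-NULL element is assumed to share the same columns.

```r
# Example data with the same shape as in the question
dat <- data.frame(id = 1:2, name = c("Julie", "Justin"),
                  stringsAsFactors = FALSE)
dat$score <- list(NULL, data.frame(rid = 1:5, math = c(5, 17, 19, 12, 16)))

# Column names taken from the first non-NULL nested data frame
score_cols <- names(Filter(Negate(is.null), dat$score)[[1]])

# One-row, all-NA placeholder used wherever score is NULL
na_row <- as.data.frame(setNames(rep(list(NA), length(score_cols)), score_cols))

# Replace NULLs, then count how many rows each parent row expands to
scores <- lapply(dat$score, function(s) if (is.null(s)) na_row else s)
n_rows <- vapply(scores, nrow, integer(1))

# Repeat id/name once per nested row and bind the stacked scores alongside
flat <- cbind(
  dat[rep(seq_len(nrow(dat)), n_rows), c("id", "name")],
  do.call(rbind, scores)
)
rownames(flat) <- NULL
flat
```

If the `score.` prefix from the question is wanted, the last two columns can be renamed afterwards, e.g. `names(flat)[3:4] <- paste0("score.", score_cols)`.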
akrun