creating columns in nested tibble if column does not exist

Question

I am trying to extract data from a nested tibble. Within the outer tibble, not all nested tibbles may exist or be complete. In case of a non-existent column I would like to return NA.

library(dplyr)

# Note: tibble() drops NULL arguments, so df only contains a and b
df <- tibble(a = tibble(iris),
             b = tibble(iris[1:2]),
             c = NULL)

Now I'd like to extract the column 'Species' from each nested tibble, with the generated column filled with NA if no data are available, so that the result equals:

tibble(a_s = iris$Species, 
       b_s = NA, 
       c_s = NA)

Is there any way I could achieve this?

I naively tried:

transmute(df, a_s = a$Species,
              b_s = b$Species,
              c_s = c$Species)

which of course only works for a_s, generates a warning for b_s and throws an error for c_s.

I have tried creating a helper function to evaluate the existence of each column, but this didn't work for nested dataframes. Any ideas on how to solve this?

UPDATE: for clarity, I always want to generate the output as specified, while tibble c may or may not be there.

can you confirm if `c` is supposed to be nested under `b` or if `c` is supposed to be the third column? — yake84, May 10 '23 at 20:39
oops, that is a mistake. I have edited the question to correct this — Joost Keuskamp, May 10 '23 at 21:02

Andre Wildberg · Accepted Answer · 2023-05-10T21:29:11.520


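Use grepl within ifelse to check whether each nested tibble has a Species column, and do.call to assemble the final tibble. Because df[[x]] returns NULL for a column that is not there, a missing tibble such as c falls through to NA just like a tibble without Species.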

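library(dplyr)

# For each expected name, take the Species column if the nested tibble has one,
# otherwise a single NA that tibble() recycles to the full length.
do.call(tibble, sapply(c("a", "b", "c"), function(x)
  ifelse(any(grepl("Species", names(df[[x]]))),
         df[[x]]["Species"],
         NA_character_))) %>%
  rename_with(~ paste0(.x, "_s"))  # a, b, c -> a_s, b_s, c_s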
# A tibble: 150 × 3
   a_s    b_s   c_s  
   <fct>  <chr> <chr>
 1 setosa NA    NA   
 2 setosa NA    NA   
 3 setosa NA    NA   
 4 setosa NA    NA   
 5 setosa NA    NA   
 6 setosa NA    NA   
 7 setosa NA    NA   
 8 setosa NA    NA   
 9 setosa NA    NA   
10 setosa NA    NA   
# … with 140 more rows
# ℹ Use `print(n = ...)` to see more rows
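
The same check can also be written out more explicitly. The following is only a sketch under stated assumptions: col_or_na is a hypothetical helper name (not part of dplyr or tibble), and it assumes an exact name match with %in% is wanted rather than the pattern match that grepl performs. It returns logical NA columns, as in the expected output in the question, instead of character NA.

```r
library(tibble)

# Example data from the question; `c = NULL` is dropped by tibble(),
# so `df` only has the nested columns `a` and `b`.
df <- tibble(a = tibble(iris),
             b = tibble(iris[1:2]),
             c = NULL)

# Hypothetical helper: return column `col` of the nested tibble `outer[[name]]`,
# or one NA per row of `outer` if the nested tibble or the column is missing.
col_or_na <- function(outer, name, col) {
  inner <- if (name %in% names(outer)) outer[[name]] else NULL
  if (is.data.frame(inner) && col %in% names(inner)) {
    inner[[col]]
  } else {
    rep(NA, nrow(outer))
  }
}

# Always ask for a, b and c, whether or not they exist in `df`.
wanted <- c("a", "b", "c")
cols <- lapply(wanted, function(nm) col_or_na(df, nm, "Species"))
names(cols) <- paste0(wanted, "_s")
result <- as_tibble(cols)
```

Here result has the columns a_s, b_s and c_s regardless of whether c is present in df. Matching with %in% avoids picking up a column whose name merely contains "Species", which the grepl version would accept.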

edited May 10 '23 at 21:29

answered May 10 '23 at 19:34

Andre Wildberg


Unfortunately this would not work for my situation. I am looking for a solution which is robust for cases where c is not there, which is why I made it to be NULL. – Joost Keuskamp May 10 '23 at 21:01
I made some changes. See if that works now. – Andre Wildberg May 10 '23 at 21:17
