10

I have the following data frame that looks like this (3 columns as list).

A tibble: 14 x 4
                                                    clinic_name drop_in_hours appointment_hours   services
                                                          <chr>        <list>            <list>     <list>
     1                   Birth Control and Sexual Health Centre    <list [1]>        <list [1]> <list [1]>
     2 Black Creek Community Health Centre (Sheridan Mall Site)    <list [1]>        <list [1]> <list [1]>
     3 Black Creek Community Health Centre (Yorkgate mall Site)    <list [1]>        <list [1]> <list [1]>
     4                                         Crossways Clinic    <list [1]>        <list [1]> <list [1]>
     5                                       Hassle Free Clinic    <list [1]>        <list [1]> <list [1]>
     6                          Immigrant Women's Health Center    <list [1]>        <list [1]> <list [1]>
     7                          Rexdale Community Health Center    <list [1]>        <list [1]> <list [1]>
     8                            Rexdale Youth Resource Center    <list [1]>        <list [1]> <list [1]>
     9                         Scarborough Sexual Health Clinic    <list [1]>        <list [1]> <list [1]>
    10                                 Special Treatment Clinic    <list [1]>        <list [1]> <list [1]>
    11                            Taibu Community Health Center    <list [1]>        <list [1]> <list [1]>
    12                                                 The Gate    <list [1]>        <list [1]> <list [1]>
    13                                   The Jane Street Clinic    <list [1]>        <list [1]> <list [1]>
    14                                            The Talk Shop    <list [1]>        <list [1]> <list [1]>

I wanted to output it to as csv file. It has come to my attention that columns of data frames shouldn't be lists in R. So I did some google and found this save data.frames with list-column so I tried it out :

library(tidyverse)

df %>% 
  mutate(drop_in_hours = map_chr(drop_in_hours, ~ capture.output(dput(.))),
         appointment_hours = map_chr(appointment_hours, ~ capture.output(dput(.))),
         services = map_chr(services, ~ capture.output(dput(.)))     ) %>% 
  write_csv("health.csv")

But I got an error , am I missing something here?

Error in mutate_impl(.data, dots) : 
  Evaluation error: Result 4 is not a length 1 atomic vector

.

Ann
  • 328
  • 1
  • 4
  • 14

6 Answers6

17

Create a tibble containing list columns:

library(tibble)

clinic_name <- c('bobo center', 'yoyo plaza', 'lolo market')
drop_in_hours <- list(c("Monday: 2 pm - 5 pm", "Tuesday: 4 pm - 7 pm")) 
appointment_hours <- list(c("Monday: 1 pm - 2 pm", "Tuesday: 2 pm - 3 pm")) 
services <- list(c("skin graft", "chicken heart replacement"))

tibb <- data_frame(clinic_name, drop_in_hours, appointment_hours, services)

print(tibb)

enter image description here

Write a general-purpose function that converts any list columns to character type:

set_lists_to_chars <- function(x) {
    if(class(x) == 'list') {
    y <- paste(unlist(x[1]), sep='', collapse=', ')
    } else {
    y <- x 
    }
    return(y)
}

Apply function to tibble with list columns:

new_frame <- data.frame(lapply(tibb, set_lists_to_chars), stringsAsFactors = F)

new_frame

enter image description here

Write newly formatted dataframe as csv file:

write.csv(new_frame, file='Desktop/clinics.csv')

enter image description here

This is a csv file with the list columns expanded as regular strings.

Here is an all-encompassing function. Just pass in your tibble and a filename:

tibble_with_lists_to_csv <- function(tibble_object, file_path_name) {
    set_lists_to_chars <- function(x) { 
        if(class(x) == 'list') { y <- paste(unlist(x[1]), sep='', collapse=', ') } else { y <- x  } 
        return(y) }
    new_frame <- data.frame(lapply(tibble_object, set_lists_to_chars), stringsAsFactors = F)
    write.csv(new_frame, file=file_path_name)
}

Usage:

tibble_with_lists_to_csv(tibb, '~/Desktop/tibb.csv')
Cybernetic
  • 12,628
  • 16
  • 93
  • 132
  • Thanks, this was great! I think that the "all-encompassing function" has `tibb` where `tibble_object` is required. – Fons MA Jun 11 '19 at 07:23
  • Thanks. tibb is a tibble object. – Cybernetic Jun 11 '19 at 08:38
  • yes, but you are calling `tibb` inside the function where it hasn't been defined. – Fons MA Jun 11 '19 at 13:10
  • Gotcha...fixed it. Thanks! – Cybernetic Jun 11 '19 at 13:26
  • This function doesn't seem to be rowwise, If the answers where different for bobo center and yoyo plaza, it would use the first line results in the entire tibble. Is it the same for you ? Any idea how to make it rowwise ? – Hugo Lehoux Oct 12 '21 at 08:23
  • As pointed by @HugoLehoux, the function does not evaluate by row. Here's a slight modification that works row-wise: `set_lists_to_chars <- function(x) { if(class(x) == 'list') { y <- sapply(seq(x), function (y) paste(unlist(x[y]), sep='', collapse=', ')) } else { y <- x } return(y) }` – Mak Nov 22 '21 at 17:01
11

Here's another option that may be a little simpler.

Depending on the data, comma separated values could get complicated, so I'm using a bar | for separating values in list columns:

library(tidyverse)

starwars %>% 
  rowwise() %>% 
  mutate_if(is.list, ~paste(unlist(.), collapse = '|')) %>% 
  write.csv('df_starwars.csv', row.names = FALSE)

starwars is one of the dplyr sample dataframes.

sbha
  • 9,802
  • 2
  • 74
  • 62
4

I had a similar dataframe with list columns that I wanted to save as csv. I figured out this method. As well as how to turn the columns back into lists.

library(tidyverse)

# create a df with a list column
df <- tibble(x=rep(1:5,each=2), y=LETTERS[1:10]) %>%
  group_by(x) %>%
  summarise(z=list(y))

# this throws an error
write_csv(df, "test.csv")

# convert the list column to a string
df2 <- df %>%
  group_by(x) %>% # where x==unique(x)
  mutate(z=paste(z))

# this works
write_csv(df2, "test.csv")

# read the csv
df3 <- read_csv("test.csv")

# reconstruct original df by parsing the strings
# https://stackoverflow.com/questions/1743698/evaluate-expression-given-as-a-string
df4 <- df3 %>%
  group_by(x) %>% 
  mutate(z=list(eval(parse(text=z))))
Simon Woodward
  • 1,946
  • 1
  • 16
  • 24
1

Is there any specific reason why you would like to save the columns as a list ? Alternatively, you can use unnest and save it in csv. example below

library(tidyverse)
df_list<-data_frame(abc = letters[1:3], lst = list(1:3, 1:3, 1:3))
df_list %>% unnest() %>% write.csv("list.csv")

further, when you read the file you can nest it back

df <- read.csv("list.csv")[ ,2:3]
df %>% nest(lst)
user5249203
  • 4,436
  • 1
  • 19
  • 45
  • I created list-columns for each variable that has multiple entries (for example, four entries for services, three entries for appointment_hours,etc). I wanted to collapse all the services for a each clinic into a list, in order to have a single row for each clinic, and a variable that is a list of all services it offers, instead of a row for every service, for every clinic. – Ann Dec 29 '17 at 18:29
  • as of 2021 I get error: 'cols' is now required when using unnest – userJT Jul 31 '21 at 01:47
1

exploratory::list_to_text() will convert a list column to a character column. The default is sep = ", ", which I recommend changing to something else if writing to a .csv.

devtools::install_github("exploratory-io/exploratory_func")

list_to_text <- function(column, sep = ", "){
  loadNamespace("stringr")
  ret <- sapply(column, function(x) {
    ret <- stringr::str_c(x, collapse = sep)
    if(identical(ret, character(0))){
      # if it's character(0)
      NA
    } else {
      ret
    }
  })
  as.character(ret)
}

https://github.com/exploratory-io/exploratory_func/blob/master/LICENSE.md

knapply
  • 647
  • 1
  • 5
  • 11
0

Following up on the example from @cybernetic, the following solution using dplyr::mutate_if works for me.

library(tibble)
library(dplyr)

clinic_name <- c('bobo center', 'yoyo plaza', 'lolo market')
drop_in_hours <- list(c("Monday: 2 pm - 5 pm", "Tuesday: 4 pm - 7 pm")) 
appointment_hours <- list(c("Monday: 1 pm - 2 pm", "Tuesday: 2 pm - 3 pm")) 
services <- list(c("skin graft", "chicken heart replacement"))

tibb <- data_frame(clinic_name, drop_in_hours, appointment_hours, services)

# unlist if column is list
tibb_unlisted <- tibb %>%
  rowwise() %>%
  dplyr::mutate_if(is.list, 
    funs(paste(unlist(.), sep='', collapse=', ')))
# print on screen, can see values
print(tibb_unlisted)
fan
  • 11
  • 6