
I have a list that looks like this:

City | Country | TrainArrivals  
A | country_1 | 8.00, 9.30, 10.00, 15.15  
B | country_1 | 11.00, 12.30, 18.00, 22.20, 22.50  
C | country_2 | 8.10, 11.20, 13.00, 16.40, 19.20, 23.00 

It is all saved as a list (called data). Note that data$TrainArrivals is itself a list, and its elements have different lengths.

I have looked at existing solutions like this one, and I have also tried calling this line:

capture.output(summary(data), file = paste(path, "values.csv", sep = "/"))    

but the .csv file didn't contain the data; instead it contained a summary of each column's type and length.

I also tried calling do.call("rbind", lapply(data, as.data.frame)) and got the following error:

Error in (function (..., row.names = NULL, check.rows = FALSE, check.names = TRUE, : arguments imply differing number of rows:

Does anyone have an idea how I could solve this problem?

EDIT: Here is the output from dput(data):

    structure(list(scenario = "first", pr = "all", rep = "2", 
    plot_data = list(c(81677L, 91437L, 233376L, 71580L, 43126L, 
    28724L, 15453L, 11162L, 8355L, 6786L, 5756L, 5162L, 4473L, 
    3848L, 3617L, 3331L, 2941L, 2572L, 2289L, 1974L, 1797L, 1575L, 
    1325L, 1217L, 1012L, 886L, 787L, 709L, 548L, 409L, 399L, 
    339L, 292L, 215L, 128L, 113L, 83L, 61L, 42L, 30L, 18L, 15L, 
    6L, 12L, 4L, 1L, 0L, 1L, 1L, 0L, 1L))), .Names = c("first", 
"pr", "rep", "plot_data"), row.names = c(NA, -1L), groups = structure(list(
    scenario = "first", pr = "all", .rows = structure(list(
        1L), ptype = integer(0), class = c("vctrs_list_of", "vctrs_vctr", 
    "list"))), .Names = c("scenario", "pr", ".rows"), row.names = 1L, class = c("tbl_df", 
"tbl", "data.frame"), .drop = TRUE), class = c("grouped_df", 
"tbl_df", "tbl", "data.frame"))

Desired output

City; Country; trainArrivals;  
A;country_1;8.00, 9.30, 10.00, 15.15;
B;country_1;11.00, 12.30, 18.00, 22.20, 22.50;  
C;country_2;8.10, 11.20, 13.00, 16.40, 19.20, 23.00;
CroatiaHR
  • Is that a `list`, or a `data.frame` with a list-column? It may help immensely if you provided the output from `dput(data)` here. – r2evans Oct 07 '20 at 17:29
  • @r2evans The problem is that I posted a simplified example of my real problem. When I try to post the dput of the original data I get "Body is limited to 30000 characters; you entered 36646." – CroatiaHR Oct 07 '20 at 17:40
  • `dput(head(data,3))` or `dput(data[1:3,1:4])`? We don't need all data, we really need *just enough* to get the point across. :-) – r2evans Oct 07 '20 at 17:44
  • okay I did dput(data[3,]), I think that should do the trick – CroatiaHR Oct 07 '20 at 17:47

1 Answer


Updated for the newer data.

You've formatted it in the question like a data.frame with a list-column, so I'll go off of that.

A couple of options:

  1. Store as JSON, so that any language immediately gets the correct structure:

    writeLines(jsonlite::toJSON(dat), "dat.json")
    str( jsonlite::read_json("dat.json", simplifyDataFrame = TRUE) )
    # 'data.frame': 1 obs. of  4 variables:
    #  $ first    : chr "first"
    #  $ pr       : chr "all"
    #  $ rep      : chr "2"
    #  $ plot_data:List of 1
    #   ..$ : int  81677 91437 233376 71580 43126 28724 15453 11162 8355 6786 ...
    
  2. Collapse the list-column into something easily undone. I'll use collapse="," here, though you can use any character known to not be in the data. (I find "," to be intuitive for other users.)

    Note that this modifies your data in-place, so if you do this, you'll either want to do it on a temporary copy of it, or you'll need to manually undo it on your real data.

    To distinguish the nested-list separator from the normal tabular field separator, I'll use write.table(., sep=";"), as much for visual clarity here as anything. Note that as long as you have normal quoting, you can use "," for both and it will parse correctly ... though it'll be a little more difficult for the eye to see the distinction.

    dat$plot_data <- sapply(dat$plot_data, paste, collapse = ",")
    write.table(dat, "dat.txt", sep = ";", row.names = FALSE, quote = FALSE)
    invisible(sapply(readLines("dat.txt"), cat, "\n"))
    # first;pr;rep;plot_data 
    # first;all;2;81677,91437,233376,71580,43126,28724,15453,11162,8355,6786,5756,5162,4473,3848,3617,3331,2941,2572,2289,1974,1797,1575,1325,1217,1012,886,787,709,548,409,399,339,292,215,128,113,83,61,42,30,18,15,6,12,4,1,0,1,1,0,1 
    
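    # stringsAsFactors = FALSE keeps plot_data as character on R < 4.0, so strsplit() works
    newdat <- read.table("dat.txt", header = TRUE, sep = ";", stringsAsFactors = FALSE)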
    newdat$plot_data <- lapply(strsplit(newdat$plot_data, "[,[:space:]]+"), as.integer)
    str(newdat)
    # 'data.frame': 1 obs. of  4 variables:
    #  $ first    : chr "first"
    #  $ pr       : chr "all"
    #  $ rep      : int 2
    #  $ plot_data:List of 1
    #   ..$ : int  81677 91437 233376 71580 43126 28724 15453 11162 8355 6786 ...
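
For the City/Country/TrainArrivals layout in the question, the same collapse-then-write idea could look roughly like the sketch below. It is only an illustration under a few assumptions: the toy data is rebuilt by hand from the question's table, the arrival times are stored as character strings (so that "8.00" keeps its trailing zeros), and the names `out`, `back`, and `path` are made up for the example. It works on a copy so the original list-column is left untouched.

```r
# Toy data rebuilt from the question's example table (assumed structure:
# a data.frame with a list-column of character times).
data <- data.frame(
  City    = c("A", "B", "C"),
  Country = c("country_1", "country_1", "country_2"),
  stringsAsFactors = FALSE
)
data$TrainArrivals <- list(
  c("8.00", "9.30", "10.00", "15.15"),
  c("11.00", "12.30", "18.00", "22.20", "22.50"),
  c("8.10", "11.20", "13.00", "16.40", "19.20", "23.00")
)

# Work on a copy so the list-column in `data` stays intact.
out <- data
# Collapse each list element into one string; vapply guarantees one string per row.
out$TrainArrivals <- vapply(out$TrainArrivals, paste, character(1), collapse = ", ")

# Write with ";" as the field separator (a temporary file is used here).
path <- tempfile(fileext = ".csv")
write.table(out, path, sep = ";", row.names = FALSE, quote = FALSE)
writeLines(readLines(path))
# City;Country;TrainArrivals
# A;country_1;8.00, 9.30, 10.00, 15.15
# B;country_1;11.00, 12.30, 18.00, 22.20, 22.50
# C;country_2;8.10, 11.20, 13.00, 16.40, 19.20, 23.00

# Undo the collapse when reading the file back in.
back <- read.table(path, header = TRUE, sep = ";", stringsAsFactors = FALSE)
back$TrainArrivals <- strsplit(back$TrainArrivals, ",[[:space:]]*")
```

If the times were stored as numbers instead, paste() would drop the trailing zeros (8.00 would be written as 8), so some formatting step such as sprintf("%.2f", x) would be needed before collapsing.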
    
r2evans