I'm seeking a dplyr-ish solution to the following task. I have a data frame with a list column, Value, whose elements each carry a dimnames attribute (str shows them as numeric matrices). The elements have different sizes. Here's the output of str(df):
Classes ‘tbl_df’, ‘tbl’ and 'data.frame': 3 obs. of 2 variables:
$ Step : int 1 2 3
$ Value:List of 3
..$ : num [1:2, 1:2] 0.232 0.261 0.932 0.875
.. ..- attr(*, "dimnames")=List of 2
.. .. ..$ : chr "4" "5"
.. .. ..$ : chr "0.2" "0.094"
..$ : num [1:2, 1:5] 0.197 0.197 0.64 0.643 0.958 ...
.. ..- attr(*, "dimnames")=List of 2
.. .. ..$ : chr "4" "5"
.. .. ..$ : chr "0.2" "0.094" "0.044" "0.021" ...
..$ : num [1:2, 1] 0.268 0.262
.. ..- attr(*, "dimnames")=List of 2
.. .. ..$ : chr "4" "5"
.. .. ..$ : chr "0.2"
I've included dput code below to recreate this dataframe.
I want a dataframe in the following format:
Step Value a b
1 0.232 4 0.200
1 0.261 5 0.200
1 0.932 4 0.094
1 0.875 5 0.094
1 NA 4 0.044
1 NA 5 0.044
1 NA 4 0.021
1 NA 5 0.021
1 NA 4 0.010
1 NA 5 0.010
2 0.197 4 0.200
2 0.197 5 0.200
2 0.640 4 0.094
2 0.643 5 0.094
2 0.958 4 0.044
2 1.032 5 0.044
2 0.943 4 0.021
2 1.119 5 0.021
2 0.943 4 0.010
2 1.119 5 0.010
3 0.268 4 0.200
3 0.262 5 0.200
3 NA 4 0.094
3 NA 5 0.094
3 NA 4 0.044
3 NA 5 0.044
3 NA 4 0.021
3 NA 5 0.021
3 NA 4 0.010
3 NA 5 0.010
where a holds the row names from each element's dimnames and b holds the column names.
I've tried a for loop to separate out each list by step, but I've not been successful in padding out the list with NAs (length(x) <- y doesn't work). I've reviewed the data types covered in Advanced R but haven't been successful in extracting the dimnames into vectors to use as dataframe columns (attr(df$Value, "dimnames") yields NULL).
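For reference, a minimal base R sketch of the two symptoms described above. It assumes the elements of Value are matrices, as the str output suggests. The names value, dn, row_names, col_names and x are made up for illustration, and the two matrices are copied from steps 1 and 3 of the dput output.

```r
# Stand-ins for two elements of df$Value (steps 1 and 3 of the dput output)
m1 <- matrix(c(0.232, 0.261, 0.932, 0.875), nrow = 2,
             dimnames = list(c("4", "5"), c("0.2", "0.094")))
m3 <- matrix(c(0.268, 0.262), nrow = 2,
             dimnames = list(c("4", "5"), "0.2"))
value <- list(m1, m3)

# The outer list has no dimnames attribute of its own ...
attr(value, "dimnames")

# ... but each element does, so the lookup has to go element by element
dn <- lapply(value, dimnames)
row_names <- lapply(dn, `[[`, 1)  # candidates for column a
col_names <- lapply(dn, `[[`, 2)  # candidates for column b

# Assigning a new length to a matrix pads with NA but drops dim and dimnames,
# which may be why length(x) <- y did not give a usable result
x <- m3
length(x) <- 4
dim(x)
```

If this reading is right, the dimnames need to be pulled out of each element before any padding is done.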
Once I have lists of the same length I can construct the new dataframe vectors step by step in the for loop and then rbind. Or is there a way to use the dimnames attribute to directly construct a wide dataframe using both row and column dimnames as dataframe column names? I can then gather to make a long dataframe.
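To make the target shape concrete, here is a rough base R sketch of one possible route: melt each matrix to long form with as.data.frame(as.table(...)), stack the pieces, then pad the missing a/b combinations with NA by merging against a full grid. It is not the dplyr-style answer being asked for, only an illustration of the intended result. The names value, step, long, grid and padded are made up, and only steps 1 and 3 are used to keep it short.

```r
# Stand-ins for two elements of df$Value (steps 1 and 3 of the dput output)
value <- list(
  matrix(c(0.232, 0.261, 0.932, 0.875), nrow = 2,
         dimnames = list(c("4", "5"), c("0.2", "0.094"))),
  matrix(c(0.268, 0.262), nrow = 2,
         dimnames = list(c("4", "5"), "0.2"))
)
step <- c(1L, 3L)

# Melt each matrix: one row per cell, with its row name (a) and column name (b)
long <- do.call(rbind, lapply(seq_along(value), function(i) {
  d <- as.data.frame(as.table(value[[i]]), stringsAsFactors = FALSE)
  names(d) <- c("a", "b", "Value")
  data.frame(Step = step[i], d, stringsAsFactors = FALSE)
}))

# Every Step x a x b combination that should appear in the result
grid <- expand.grid(a = unique(long$a), b = unique(long$b), Step = step,
                    stringsAsFactors = FALSE)

# Left join onto the grid: combinations absent from a step get Value = NA
padded <- merge(grid, long, all.x = TRUE)
padded <- padded[order(padded$Step, -as.numeric(padded$b), padded$a),
                 c("Step", "Value", "a", "b")]
padded
```

Here a and b stay as character strings; as.numeric() would convert them if numeric columns are wanted. In the tidyverse, the padding step could presumably be done with tidyr's complete() instead of expand.grid() plus merge().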
There are several subquestions here, and I'm sure there's a more elegant solution than the one I've mapped out. Thanks for looking.
Here's the dput code to create the dataframe:
df <- structure(list(Step = c(1L, 2L, 3L), Value = list(structure(c(0.232,
0.261, 0.932, 0.875), .Dim = c(2L,
2L), .Dimnames = list(c("4", "5"), c("0.2", "0.094"
))), structure(c(0.197, 0.197, 0.640,
0.643, 0.958, 1.032, 0.943,
1.119, 0.943, 1.119), .Dim = c(2L,
5L), .Dimnames = list(c("4", "5"), c("0.2", "0.094",
"0.044", "0.021", "0.01"))), structure(c(0.268,
0.262), .Dim = c(2L, 1L), .Dimnames = list(c("4",
"5"), "0.2")))), class = c("tbl_df", "tbl", "data.frame"), row.names = c(NA,
-3L), .Names = c("Step", "Value"))