This is a simple question, but I've searched for a solution and so far found nothing.
Say I have a list object, and I want to pull specific list elements and output them side by side as data frame columns. How can I achieve this in a simple way with tidyverse piping? My attempt is below.
Data
some_data <-
structure(list(x = c(23.7, 23.41, 23.87, 24.18, 24.15, 24.31,
23.14, 23.72, 24.12, 23.47, 23.59, 23.29, 23.24, 23.5, 23.56,
23.16, 23.62, 23.67, 23.84, 23.69, 23.7, 23.68, 24.2, 23.77,
23.74, 23.64, 24.39, 24.05, 24.51, 23.6, 24.29, 23.31, 23.96,
24.07, 24.37, 23.77, 23.64, 24, 23.68, 24.02, 23.36, 23.54, 23.34,
23.69, 23.79, 23.8, 23.7, 24.45, 23.27, 23.57, 23.02, 24.23,
23.41, 23.6, 24.02, 23.94, 24.06, 23.97, 23.38, 23.46, 24, 23.89,
23.51, 23.72, 23.83, 23.96, 23.84, 23.52, 24.36, 23.94, 23.82,
24.04, 24.05, 23.6, 23.52, 24.13, 23.43, 23.33, 24.01, 23.99,
24.46, 24.23, 24.19, 23.83, 23.8, 23.93, 23.79, 23.48, 23.26,
24.04, 23.93, 23.98, 23.86, 23.49, 24.17, 23.7, 23.54, 23.55,
23.67, 23.66)), class = c("spec_tbl_df", "tbl_df", "tbl", "data.frame"
), row.names = c(NA, -100L), spec = structure(list(cols = list(
x = structure(list(), class = c("collector_double", "collector"
))), default = structure(list(), class = c("collector_guess",
"collector")), skip = 1), class = "col_spec"))
I want the return value of the `hist()` function for this data:
library(tidyverse)
some_data$x %>%
as.numeric() %>%
hist(breaks = seq(from = 23, to = 24.6, by = 0.2),
plot = FALSE)
## $breaks
## [1] 23.0 23.2 23.4 23.6 23.8 24.0 24.2 24.4 24.6
## $counts
## [1] 3 9 20 23 19 16 7 3
## $density
## [1] 0.15 0.45 1.00 1.15 0.95 0.80 0.35 0.15
## $mids
## [1] 23.1 23.3 23.5 23.7 23.9 24.1 24.3 24.5
## $xname
## [1] "."
## $equidist
## [1] TRUE
## attr(,"class")
## [1] "histogram"
Now let's say that I want both `$breaks` and `$counts` side by side as a data frame.
I extended the original pipe as follows:
some_data$x %>%
as.numeric() %>%
hist(breaks = seq(from = 23, to = 24.6, by = 0.2),
plot = FALSE) %>%
##
map_df(~.[1:30]) %>%
select(bins = breaks,
frequency = counts)
##
## # A tibble: 30 x 2
## bins frequency
## <dbl> <int>
## 1 23 3
## 2 23.2 9
## 3 23.4 20
## 4 23.6 23
## 5 23.8 19
## 6 24 16
## 7 24.2 7
## 8 24.4 3
## 9 24.6 NA
## 10 NA NA
## # ... with 20 more rows
So yes, it does work, but in `map_df()` I had to put a relatively large "magic" number (arbitrarily I put 30) to ensure all data is included. Is there a simpler way to get `$breaks` and `$counts` as a dataframe? Maybe even with just one step instead of combining `map_df()` and then `select()`?
COMMENT
While this specific problem uses an object of class `histogram`, my general question isn't about histograms but about list objects in general. The nice thing about the output of `hist(plot = FALSE)` is that it is a list whose elements have unequal lengths, so it demonstrates a problem that needs a flexible solution to account for the variation in element length.
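To make the unequal lengths concrete, here is a minimal base-R sketch. The vector `x` is a small made-up stand-in for `some_data$x` (hypothetical values, not the original data); the breaks are the same as above. `lengths()` reports the length of each list element, and `breaks` always has one more entry than `counts` because n bins need n + 1 boundaries.

```r
# Hypothetical toy data standing in for some_data$x
x <- c(23.1, 23.3, 23.3, 23.5, 23.7, 23.7, 23.7, 24.1, 24.5)

# Same breaks as in the question; plot = FALSE returns the list only
h <- hist(x, breaks = seq(from = 23, to = 24.6, by = 0.2), plot = FALSE)

# Length of every element of the returned list
elem_lengths <- lengths(h)
print(elem_lengths)
```

This is why binding the elements directly into columns fails unless they are first truncated or padded to a common length.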
SOLUTION
Based on Rémi Coulaud's (accepted) solution below, the way to handle list elements of unequal length is to make them equal by padding each one with `NA` up to the length of the longest element. After that, combining them into a data frame is no longer a problem. The working pipe is as follows:
library(tidyverse)
some_data$x %>%
as.numeric() %>%
hist(breaks = seq(from = 23, to = 24.6, by = 0.2),
plot = FALSE) %>%
lapply(., `length<-`, max(lengths(.))) %>% ## pad every element with NA to the length of the longest one
map_df(~.) %>%
select(bins = breaks,
frequency = counts)
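For reference, the same padding idea can be sketched in base R alone. This is only an illustration under assumptions: `x` is made-up toy data rather than `some_data$x`, and picking the two wanted elements before padding is my variation, not part of the accepted answer. It avoids padding unrelated elements such as `xname`.

```r
# Hypothetical toy data standing in for some_data$x
x <- c(23.1, 23.3, 23.3, 23.5, 23.7, 23.7, 23.7, 24.1, 24.5)
h <- hist(x, breaks = seq(from = 23, to = 24.6, by = 0.2), plot = FALSE)

# Keep only the elements of interest (unclass() gives a plain list)
picked <- unclass(h)[c("breaks", "counts")]

# `length<-`(v, n) extends v to length n, filling with NA
n <- max(lengths(picked))
padded <- lapply(picked, `length<-`, n)

# Elements now have equal length, so they fit side by side
out <- data.frame(bins = padded$breaks, frequency = padded$counts)
print(out)
```

The last row has `NA` in `frequency`, because there is one more break than there are bins.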
Thanks!