
I am working with the dplyr library and have created a dataframe in a pipe that looks something like this:

library(dplyr)
a <- c(1, 2, 2)
b <- c(3, 4, 4)
data <- data.frame(a, b)
data %>% summarize_all(c(min, max))

which gives me this dataframe:

a_fn1 b_fn1 a_fn2 b_fn2
    1     3     2     4

and I am trying to reshape it so that the pipe returns one row per function, with the values stacked under their original columns, like this:

A  B
----
1  3
2  4

How would I go about this? I do not want to change how the functions are called, because summarize_all gives me the values I am looking for. I just want to reshape the result so that each row holds the output of one summary function, with one column per original column.

Anonymous
  • You should make a reproducible example, please read: https://stackoverflow.com/a/5963610/6574038 – jay.sf Jul 22 '21 at 08:54
  • Thanks. Hopefully my edits made this a more reproducible example. – Anonymous Jul 22 '21 at 09:01
  • I'm sorry, it is still unclear to me what your goal is. However, for reshaping dataframes with tidyverse, check out `tidyr::pivot_longer` and `tidyr::pivot_wider` – Edo Jul 22 '21 at 09:02
  • Pivot longer does not do what I want it to do because it makes column names into a single row. I just want to stack the result of the summarize function to individual rows for each function instead of a single row including all of the outputs of the functions – Anonymous Jul 22 '21 at 09:14
  • Related post, there is a dplyr/tidy solution, try it out: https://stackoverflow.com/q/46841179/680068 – zx8754 Jul 22 '21 at 09:18
  • before `summarize_all()` try doing a `group_by_all()` like so: `data %>% group_by_all() %>% summarize_all(c(min, max)) %>% ungroup()` – koolmees Jul 22 '21 at 09:22
  • group_by_all() just returns the function called on single rows which is not what I am trying to accomplish. – Anonymous Jul 22 '21 at 09:25
  • This doesn't exactly work because I need to feed parameters to the functions in the list. – Anonymous Jul 22 '21 at 09:32
  • Another related post, getting min max for each column: `apply(data, 2, range)` https://stackoverflow.com/q/26893178/680068 – zx8754 Jul 22 '21 at 09:35
  • the min and max functions are just placeholders for the functions I am trying to run. – Anonymous Jul 22 '21 at 09:36
  • `data %>% summarize_all(c(min, max)) %>% pivot_longer(cols = everything(), names_to = '.value', names_pattern = '([a-z]+)_.*')` – Ronak Shah Jul 22 '21 at 09:56

1 Answer


First, naming your functions in summarize_all() makes those names appear in the result's column names (a_min instead of a_fn1), which makes wrangling easier.

Then, you can use pivot_longer() with the special .value sentinel in names_to to achieve what you want:

library(tidyverse)
a <- c(1, 2, 2)
b <- c(3, 4, 4)
data <- data.frame(a, b)
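data %>% 
  # named functions give columns a_min, b_min, a_max, b_max
  summarize_all(c(min=min, max=max)) %>%
  # the part before "_" becomes the output column, the part after it goes into "variable";
  # "(.+)" rather than "(.)" also matches column names longer than one character
  pivot_longer(everything(), names_to=c(".value", "variable"), names_pattern="(.+)_(.+)")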
#> # A tibble: 2 x 3
#>   variable     a     b
#>   <chr>    <dbl> <dbl>
#> 1 min          1     3
#> 2 max          2     4

Created on 2021-07-22 by the reprex package (v2.0.0)

Depending on what output you want, you can even switch the order to c("variable", ".value").
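To illustrate the swapped order, here is a minimal sketch on the same toy data. It assumes the column names contain no extra underscores in the function part; the name `by_column` is only for illustration. With "variable" first, each original column becomes a row and each function becomes a column.

```r
library(dplyr)
library(tidyr)

data <- data.frame(a = c(1, 2, 2), b = c(3, 4, 4))

# `by_column` is an illustrative name, not part of the original answer
by_column <- data %>%
  summarize_all(c(min = min, max = max)) %>%
  pivot_longer(
    everything(),
    # the part before "_" goes into "variable", the part after it names the value columns
    names_to = c("variable", ".value"),
    names_pattern = "(.+)_(.+)"
  )

by_column
```

The result has one row for `a` and one for `b`, with `min` and `max` as columns, which is the transpose of the layout shown above.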

Note that summarize_all() is superseded and that you might want to use the newer, more verbose syntax: summarize(across(everything(), c(min=min, max=max))).
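As a rough sketch of how the across() form could look in the full pipeline: the `label` column, the `where(is.numeric)` predicate, and the `na.rm = TRUE` arguments are assumptions added here to stand in for a summarize_if-style predicate and for functions that need extra parameters. They are not from the original question. The name `stat` replaces "variable" and is dropped at the end to leave only the stacked values.

```r
library(dplyr)
library(tidyr)

# `label` is a hypothetical non-numeric column, added only to show the predicate at work
data <- data.frame(a = c(1, 2, 2), b = c(3, 4, 4), label = c("x", "y", "z"))

stacked <- data %>%
  summarize(across(
    where(is.numeric),  # predicate, playing the role of summarize_if()
    list(
      # lambdas let you pass extra arguments to each function
      min = ~ min(.x, na.rm = TRUE),
      max = ~ max(.x, na.rm = TRUE)
    )
  )) %>%
  # across() names the columns a_min, a_max, b_min, b_max by default
  pivot_longer(
    everything(),
    names_to = c(".value", "stat"),
    names_pattern = "(.+)_(.+)"
  ) %>%
  select(-stat)  # keep only the stacked values, one row per function

stacked
```

Here `stacked` has columns `a` and `b` only, with the minimum of each column in the first row and the maximum in the second.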

Dan Chaltiel
  • actually in my real example I use summarize_if as I require the predicate to run the functions I need. – Anonymous Jul 22 '21 at 09:41
  • @Anonymous you should check the help of `?across`, it can handle predicates as well. – Dan Chaltiel Jul 22 '21 at 09:43
  • Noted. However for now the result of my summarize is the correct output. This solution however breaks apart the names of my columns and does not give me a solution as it outputs multiple rows for each function. – Anonymous Jul 22 '21 at 09:46
  • @Anonymous this solution gives the expected output that you wrote in your question (with the extra "variable" column that you may remove). If this is not what you want you need to explain it better and give us another expected output. This might even belong in another question, I'm not sure. – Dan Chaltiel Jul 22 '21 at 13:21