summarize() can give you exactly this, especially if all the stats you want are computed within groups defined by one variable, i.e. Sample:
library(raster) # provides cv(): coefficient of variation, as a percentage
#> Loading required package: sp
library(tidyverse)
data <- tribble(
  ~rowname, ~Sample, ~Phagocytic_Score,
  1, 1232, 24030,
  2, 1232, 11040,
  3, 4321, 7266,
  4, 4321, 4096,
  5, 5631, 7383,
  6, 5631, 21507
)
data %>%
  group_by(Sample) %>%
  summarize(
    mean = mean(Phagocytic_Score),
    sd = sd(Phagocytic_Score),
    pct_cv = cv(Phagocytic_Score)
  )
#> # A tibble: 3 x 4
#>   Sample  mean    sd pct_cv
#>    <dbl> <dbl> <dbl>  <dbl>
#> 1   1232 17535 9185.   52.4
#> 2   4321  5681 2242.   39.5
#> 3   5631 14445 9987.   69.1
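For reference, the pct_cv column is just the standard deviation expressed as a percentage of the mean. Here is a minimal base-R sketch of that arithmetic. Note that cv_pct is a hypothetical helper written for illustration (it is not part of raster or dplyr), and it assumes there are no missing values:

```r
# cv_pct() is a hypothetical helper, not from any package:
# coefficient of variation, expressed as a percentage
cv_pct <- function(x) 100 * sd(x) / mean(x)

# The two scores recorded for Sample 1232 in the data above
scores_1232 <- c(24030, 11040)

round(cv_pct(scores_1232), 1)
#> [1] 52.4
```

This should agree with the first row of the summary, which is a quick way to check that cv() is doing what you expect.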
We've got some repeating going on, though, don't we? Each variable is defined as a function call with the same input variable. summarize_at() is more appropriate, then:
data %>%
  group_by(Sample) %>%
  summarize_at("Phagocytic_Score",
               list(mean = mean, sd = sd, cv = cv))
#> # A tibble: 3 x 4
#>   Sample  mean    sd    cv
#>    <dbl> <dbl> <dbl> <dbl>
#> 1   1232 17535 9185.  52.4
#> 2   4321  5681 2242.  39.5
#> 3   5631 14445 9987.  69.1
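If you're on dplyr 1.0.0 or later, the scoped verbs such as summarize_at() are superseded by across(). Below is a sketch of the same summary under that assumption. Here cv_pct is a hypothetical stand-in for raster's cv() so the example runs without raster, the data is rebuilt with data.frame() to keep it self-contained, and .names = "{.fn}" is what keeps the short column names:

```r
library(dplyr)

# Hypothetical stand-in for raster::cv(): sd as a percentage of the mean
cv_pct <- function(x) 100 * sd(x) / mean(x)

# Same values as the tribble above, minus the rowname column
data <- data.frame(
  Sample = c(1232, 1232, 4321, 4321, 5631, 5631),
  Phagocytic_Score = c(24030, 11040, 7266, 4096, 7383, 21507)
)

result <- data %>%
  group_by(Sample) %>%
  summarize(
    across(Phagocytic_Score,
           list(mean = mean, sd = sd, cv = cv_pct),
           .names = "{.fn}")  # name each column after its function only
  )
result
```

Without the .names argument, across() would prefix each column with the input name (Phagocytic_Score_mean and so on), which may be what you want once you summarize more than one column.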
Ah, but there's still some more room for improvement. Why are we repeating the names of the functions as the names of the variables, since they're the same? Well, mget() will take a single vector of the function names we want, and return a named list of those functions, with the names as those function names:
data %>%
  group_by(Sample) %>%
  summarize_at("Phagocytic_Score",
               mget(c("mean", "sd", "cv"), inherits = TRUE))
#> # A tibble: 3 x 4
#>   Sample  mean    sd    cv
#>    <dbl> <dbl> <dbl> <dbl>
#> 1   1232 17535 9185.  52.4
#> 2   4321  5681 2242.  39.5
#> 3   5631 14445 9987.  69.1
Note we need inherits = TRUE for the reason explained here.
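In short, going by the base R documentation for mget(): inherits defaults to FALSE, so only the one environment you give it is searched. Functions like mean and sd live in package environments further up the search path, so they are only found when the enclosing environments are searched too. Here is a small sketch of the difference, using a throwaway environment e that exists purely for illustration:

```r
# A fresh, empty environment; its parent chain eventually reaches base R
e <- new.env()

# Default (inherits = FALSE): only `e` itself is searched, so nothing is found
# and the ifnotfound value (NULL) is returned instead
not_found <- mget("mean", envir = e, ifnotfound = list(NULL))
is.null(not_found$mean)
#> [1] TRUE

# inherits = TRUE: the enclosing environments are searched as well
fns <- mget(c("mean", "sd"), envir = e, inherits = TRUE)
names(fns)
#> [1] "mean" "sd"
fns$mean(c(1, 2, 3))
#> [1] 2
```

The result is an ordinary named list of functions, which is exactly the shape summarize_at() expects for its function argument.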
Created on 2019-10-22 by the reprex package (v0.3.0)