I have data in the following format: each sample belongs to a group (A or B in this example) and has a numerical quantity and a quality score (which is a factor).
I would like to summarise qual_score by group_name, i.e. count how many times each quality level occurs in each group.
Example Data:
group_name <- rep(c("A","B"),5)
qual_score <- c(rep("POOR",4),rep("FAIR",1),rep("GOOD",5))
quantity <- 5:14
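# stringsAsFactors = TRUE keeps group_name and qual_score as factors on R >= 4.0
df <- data.frame(group_name, qual_score, quantity, stringsAsFactors = TRUE)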
> df
   group_name qual_score quantity
1           A       POOR        5
2           B       POOR        6
3           A       POOR        7
4           B       POOR        8
5           A       FAIR        9
6           B       GOOD       10
7           A       GOOD       11
8           B       GOOD       12
9           A       GOOD       13
10          B       GOOD       14
Desired Output:
desired_output <- data.frame(POOR = c(2, 2), FAIR = c(1, 0), GOOD = c(2, 3),
                             row.names = c("A", "B"))
> desired_output
  POOR FAIR GOOD
A    2    1    2
B    2    0    3
I can call summary() on qual_score for the entire data frame:
> summary(df$qual_score)
FAIR GOOD POOR
   1    5    4
And I can use group_by() to summarise mean(quantity) for each group:
> library(dplyr)
> df %>%
+ group_by(group_name) %>%
+ summarise(mean(quantity))
# A tibble: 2 x 2
  group_name `mean(quantity)`
  <fct>                 <dbl>
1 A                         9
2 B                        10
But when I try to use group_by() with summary(), the grouping is ignored and I get warnings along with the following output:
> df %>%
+ group_by(group_name) %>%
+ summary(qual_score)
 group_name qual_score    quantity
 A:5        FAIR:1     Min.   : 5.00
 B:5        GOOD:5     1st Qu.: 7.25
            POOR:4     Median : 9.50
                       Mean   : 9.50
                       3rd Qu.:11.75
                       Max.   :14.00
Warning messages:
1: In if (length(ll) > maxsum) { :
the condition has length > 1 and only the first element will be used
2: In if (length(ll) > maxsum) { :
the condition has length > 1 and only the first element will be used
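For reference, one way the desired layout might be produced is a plain cross-tabulation in base R. This is only a minimal sketch under a few assumptions that are not part of the attempts above: the factor levels are set explicitly to POOR, FAIR, GOOD so the columns come out in the desired order, and table() plus as.data.frame.matrix() are used instead of dplyr. The name counts_df is purely illustrative.

```r
# Recreate the example data; qual_score is built as a factor with an explicit
# level order (an assumption) so the columns match the desired output.
group_name <- rep(c("A", "B"), 5)
qual_score <- factor(
  c(rep("POOR", 4), rep("FAIR", 1), rep("GOOD", 5)),
  levels = c("POOR", "FAIR", "GOOD")
)
quantity <- 5:14
df <- data.frame(group_name, qual_score, quantity)

# Cross-tabulate group against quality score; combinations that never occur
# (here B/FAIR) are kept as 0.
counts <- table(df$group_name, df$qual_score)

# Turn the contingency table into an ordinary data frame with groups as row names.
counts_df <- as.data.frame.matrix(counts)
print(counts_df)
```

As for the warnings, a likely explanation is that summary() is not a dplyr verb: the pipe passes the grouped data frame as the first argument, so qual_score is matched positionally to the maxsum argument of summary.data.frame() rather than being treated as a column to summarise, and the grouping is never used.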