Using the iris dataset, I'm trying to calculate a z-score for each of the variables. I have put the data in tidy format by performing the following:
library(reshape2)
library(dplyr)
test <- iris
test <- melt(iris,id.vars = 'Species')
That gives me the following:
  Species     variable value
1  setosa Sepal.Length   5.1
2  setosa Sepal.Length   4.9
3  setosa Sepal.Length   4.7
4  setosa Sepal.Length   4.6
5  setosa Sepal.Length   5.0
6  setosa Sepal.Length   5.4
But when I try to create a z-score column for each group (the groups are needed because, e.g., the z-score for Sepal.Length will not be comparable to that of Sepal.Width) using the following:
test <- test %>%
  group_by(Species, variable) %>%
  mutate(z_score = (value - mean(value)) / sd(value))
The resulting z-scores have not been grouped, and are based on all of the data.
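For reference, here is a minimal sketch of how the grouping could be checked. If the z-scores really are computed per group, every Species/variable combination should have a z-score mean of about 0 and a standard deviation of about 1. The calls are written as dplyr::mutate and so on, on the assumption (not confirmed anywhere in this post) that another attached package, such as plyr, might be masking mutate. The names long, scored and check are just illustrative.

```r
library(reshape2)
library(dplyr)

# Reshape iris to long format: one row per Species / variable / value
long <- melt(iris, id.vars = "Species")

# Namespace-qualified verbs, so a mutate() from another attached
# package cannot be picked up by accident (assumed possible cause)
scored <- long %>%
  dplyr::group_by(Species, variable) %>%
  dplyr::mutate(z_score = (value - mean(value)) / sd(value)) %>%
  dplyr::ungroup()

# Per-group summary: mean should be ~0 and sd ~1 if grouping was applied
check <- scored %>%
  dplyr::group_by(Species, variable) %>%
  dplyr::summarise(
    z_mean = mean(z_score),
    z_sd = sd(z_score),
    .groups = "drop"
  )

print(check)
```

If z_mean and z_sd differ noticeably from 0 and 1 within the groups, the z-scores were computed over the whole column rather than per group.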
What's the best way to return the z-scores by group using dplyr?
Many thanks!