This question is based on "How do I calculate a grouped z score in R using dplyr?".
Here the data are scaled to z-scores both within groups and across the whole data set:
library(dplyr)
library(tidyr)

# Reshape iris to long format, then compute z-scores per group and overall
dat = iris %>%
  gather(variable, value, -Species) %>%
  group_by(Species, variable) %>%
  mutate(z_score_group = (value - mean(value)) / sd(value)) %>%
  ungroup() %>%
  mutate(z_score_ungrouped = (value - mean(value)) / sd(value))
Scaling without grouping preserves the order of the data:
> identical(order(dat$z_score_ungrouped), order(dat$value))
[1] TRUE
However, interestingly, scaling group-wise changes the order of the data:
> identical(order(dat$z_score_group), order(dat$value))
[1] FALSE
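The same effect can be reproduced on a small made-up data frame, which may make it easier to see. This is only a sketch: the `toy` data frame and its column names are hypothetical and not part of the data above, and base R's `ave()` is used instead of dplyr so that it runs without extra packages.

```r
# Hypothetical toy data: two groups with very different means
toy <- data.frame(
  group = c("a", "a", "a", "b", "b", "b"),
  value = c(1, 2, 3, 101, 102, 103)
)

# Group-wise z-scores: each group is centred and scaled with its own mean and sd
toy$z_group <- ave(toy$value, toy$group,
                   FUN = function(x) (x - mean(x)) / sd(x))

# Ungrouped z-scores: one mean and one sd for all rows
toy$z_all <- (toy$value - mean(toy$value)) / sd(toy$value)

# Is the ordering preserved within each group?
within_ok <- all(tapply(seq_len(nrow(toy)), toy$group, function(i) {
  identical(order(toy$z_group[i]), order(toy$value[i]))
}))

# Is the ordering preserved across all rows?
across_group <- identical(order(toy$z_group), order(toy$value))
across_all   <- identical(order(toy$z_all), order(toy$value))

print(toy)
print(c(within = within_ok, across_group = across_group, across_all = across_all))
```

In this toy case both groups end up with the z-scores -1, 0 and 1, so the ordering is kept within each group and for the ungrouped scaling, but not across all rows for the group-wise scaling.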
In my opinion, scaling should never change the order of the data, because this has a huge impact on rank-based analyses (e.g. ROC curves). Does anyone have an idea why grouping changes the order?