I have a dataset that measures the abundance of macroinvertebrates from multiple sample sites. I wish to compare results from the most recent years of sampling with results from all previous years of sampling at the same sites.
My data looks like this:
# A tibble: 6 x 5
  basin                sitecode sampleid metric       value
  <fct>                <chr>       <int> <chr>        <dbl>
1 arctic coast islands HUSK1       13482 s_abundance1  5312
2 arctic coast islands HUSK1       13482 s_abundance2    NA
3 arctic coast islands NOEL1       13488 s_abundance1   616
4 arctic coast islands NOEL1       13488 s_abundance2    NA
5 arctic coast islands RPR070       6815 s_abundance1    NA
6 arctic coast islands RPR070       6815 s_abundance2   697
Here, s_abundance1 is the site abundance from the most recent sampling, and s_abundance2 is the site abundance from previous sampling at the same site(s).
The entire dataset is about 4000 rows and contains sample data for many different drainage basins.
I would like to perform a Mann-Whitney U test comparing s_abundance1 against s_abundance2, grouped by basin, with the results for all basins in a single output.
The code I have been using is:
abund_results %>%
  group_by(basin) %>%
  summarise(tidy(wilcox.test(abund_results$value ~ abund_results$metric, data = .)))
It seems to work, except that all of the p-values come out exactly the same. Here is the output:
`summarise()` ungrouping output (override with `.groups` argument)
# A tibble: 17 x 5
  basin                statistic   p.value method                                 alternative
  <fct>                    <dbl>     <dbl> <chr>                                  <chr>
1 arctic coast islands   181204. 5.82e-108 Wilcoxon rank sum test with continuit… two.sided
2 columbia               181204. 5.82e-108 Wilcoxon rank sum test with continuit… two.sided
3 fraser lower mainla…   181204. 5.82e-108 Wilcoxon rank sum test with continuit… two.sided
4 great lakes            181204. 5.82e-108 Wilcoxon rank sum test with continuit… two.sided
5 lower mackenzie        181204. 5.82e-108 Wilcoxon rank sum test with continuit… two.sided
6 lower saskatchewan-…   181204. 5.82e-108 Wilcoxon rank sum test with continuit… two.sided
7 maritime coastal       181204. 5.82e-108 Wilcoxon rank sum test with continuit… two.sided
What do I need to change to get different results for each basin?
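One likely cause is that abund_results$value and abund_results$metric refer to the full, ungrouped columns, so every group runs the same test on all of the data. The following is a minimal sketch of one possible approach, not a confirmed fix: it uses dplyr's group_modify(), which passes each group's rows to the function as .x. The toy data frame and its values are made up purely for illustration; only the column names follow abund_results.

```r
library(dplyr)
library(broom)

# Hypothetical toy data with the same column names as abund_results.
# The values are randomly generated for illustration, not real measurements.
set.seed(1)
toy <- tibble(
  basin  = factor(rep(c("basin_a", "basin_b"), each = 20)),
  metric = rep(rep(c("s_abundance1", "s_abundance2"), each = 10), times = 2),
  value  = c(rnorm(10, 500, 20), rnorm(10, 700, 20),
             rnorm(10, 300, 20), rnorm(10, 310, 20))
)

# Run one Wilcoxon rank sum (Mann-Whitney U) test per basin.
# Inside group_modify(), .x holds only the current basin's rows,
# so the bare column names value and metric are looked up per group.
per_basin <- toy %>%
  group_by(basin) %>%
  group_modify(~ tidy(wilcox.test(value ~ metric, data = .x))) %>%
  ungroup()

per_basin
```

With the real data, this assumes every basin has non-missing values for both s_abundance1 and s_abundance2; wilcox.test() stops with an error if a group ends up with only one level of metric after rows with NA are dropped.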