I have the following dataframe:
species <- c("a","a","a","b","b","b","c","c","c","d","d","d","e","e","e","f","f","f","g","h","h","h","i","i","i")
category <- c("h","l","m","h","l","m","h","l","m","h","l","m","h","l","m","h","l","m","l","h","l","m","h","l","m")
minus <- c(31,14,260,100,70,200,91,152,842,16,25,75,60,97,300,125,80,701,104,70,7,124,24,47,251)
plus <- c(2,0,5,0,1,1,4,4,30,1,0,0,2,0,5,0,0,3,0,0,0,0,0,0,4)
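# data.frame() keeps minus and plus numeric; cbind() would coerce every column to character
df <- data.frame(species, category, minus, plus, stringsAsFactors = FALSE)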
I want to run chisq.test for every pair of categories within each species, like this:
Species a, category h and l: p-value
Species a, category h and m: p-value
Species a, category l and m: p-value
Species b, ... and so on
With the following chisq.test (dummy code):
chisq.test(c(minus(cat1, cat2),plus(cat1, cat2)))$p.value
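For concreteness, a single comparison might look like the sketch below. It assumes the dummy call is meant as a test on a 2x2 contingency table (rows are the two categories, columns are minus and plus); the object names `counts` and `p` are only illustrative.

```r
# Counts for species "a", categories "h" and "l" (values taken from the data above)
counts <- matrix(c(31, 14,   # minus: h, l
                    2,  0),  # plus:  h, l
                 ncol = 2,
                 dimnames = list(category = c("h", "l"),
                                 outcome  = c("minus", "plus")))

# Small expected counts make R warn that the chi-squared approximation
# may be inaccurate, so the warning is silenced here.
p <- suppressWarnings(chisq.test(counts)$p.value)
```

By default chisq.test applies Yates' continuity correction to 2x2 tables; passing `correct = FALSE` turns it off.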
I want to end up with a table that presents each chisq.test p-value for each comparison, like this:
Species Category1 Category2 p-value
a h l 0.05
a h m 0.2
a l m 0.1
b...
where Category1 and Category2 are the two categories compared in the chisq.test.
Is this possible using dplyr? I have tried adapting the approaches mentioned here and here, but as far as I can tell they don't really apply to this problem.
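To make the target concrete, here is a rough sketch of the table I am after, written in base R rather than dplyr. It assumes each comparison is a 2x2 contingency table of minus/plus counts, and `compare_categories` and `result` are hypothetical names used only for illustration.

```r
species  <- c("a","a","a","b","b","b","c","c","c","d","d","d","e","e","e",
              "f","f","f","g","h","h","h","i","i","i")
category <- c("h","l","m","h","l","m","h","l","m","h","l","m","h","l","m",
              "h","l","m","l","h","l","m","h","l","m")
minus <- c(31,14,260,100,70,200,91,152,842,16,25,75,60,97,300,125,80,701,
           104,70,7,124,24,47,251)
plus  <- c(2,0,5,0,1,1,4,4,30,1,0,0,2,0,5,0,0,3,0,0,0,0,0,0,4)
df <- data.frame(species, category, minus, plus, stringsAsFactors = FALSE)

# Hypothetical helper: all pairwise category comparisons within one species
compare_categories <- function(d) {
  if (nrow(d) < 2) return(NULL)   # e.g. species "g" has only one category
  pairs <- combn(nrow(d), 2)      # each column holds the row indices of one pair
  do.call(rbind, lapply(seq_len(ncol(pairs)), function(k) {
    rows   <- d[pairs[, k], ]
    counts <- as.matrix(rows[, c("minus", "plus")])
    data.frame(
      Species   = rows$species[1],
      Category1 = rows$category[1],
      Category2 = rows$category[2],
      # Warnings about small expected counts are silenced; pairs where
      # both plus counts are zero come out as NaN.
      p.value   = suppressWarnings(chisq.test(counts)$p.value),
      stringsAsFactors = FALSE
    )
  }))
}

# Split by species, compare within each group, and stack the results
result <- do.call(rbind, lapply(split(df, df$species), compare_categories))
rownames(result) <- NULL
```

The pairing itself comes from `combn`, which enumerates every pair of rows inside a species; a dplyr version would presumably need an equivalent step after `group_by(species)`.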
EDIT: I also would like to see how this could be done for the following dataset:
species <- c(1:11)
minus <- c(132,78,254,12,45,76,89,90,100,42,120)
plus <- c(1,2,0,0,0,3,2,5,6,4,0)
I would like to run a chisq.test for each species in the table against every other species in the table (a pairwise comparison across all species). I want to end up with something like this:
species1 species2 p-value
1 2 0.5
1 3 0.7
1 4 0.2
...
11 10 0.02
I tried changing the code above to the following:
species_chisq %>%
  do(data_frame(species1 = first(.$species),
                species2 = last(.$species),
                data = list(matrix(c(.$minus, .$plus), ncol = 2)))) %>%
  mutate(chi_test = map(data, chisq.test, correct = FALSE)) %>%
  mutate(p.value = map_dbl(chi_test, "p.value")) %>%
  ungroup() %>%
  select(species1, species2, p.value)
However, this only produced a table in which each species was compared to itself, not to the other species. I do not quite understand where the original code given by @ycw specifies which groups are compared.
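For reference, the result I am aiming for could be sketched in base R as follows, with the pairing made explicit through `combn`. This is only a sketch under the assumption that each comparison is a 2x2 table of minus/plus counts without continuity correction (matching `correct = FALSE` above); `pairs`, `p_values` and `species_pairs` are illustrative names.

```r
species <- 1:11
minus <- c(132, 78, 254, 12, 45, 76, 89, 90, 100, 42, 120)
plus  <- c(1, 2, 0, 0, 0, 3, 2, 5, 6, 4, 0)

# Every unordered pair of species, one pair per row (55 rows for 11 species)
pairs <- t(combn(species, 2))

# One test per pair. Species labels double as positions here because
# species is 1:11; other labels would need match() to find the rows.
p_values <- apply(pairs, 1, function(idx) {
  counts <- cbind(minus = minus[idx], plus = plus[idx])
  # Warnings about small expected counts are silenced; pairs where
  # both plus counts are zero come out as NaN.
  suppressWarnings(chisq.test(counts, correct = FALSE)$p.value)
})

species_pairs <- data.frame(species1 = pairs[, 1],
                            species2 = pairs[, 2],
                            p.value  = p_values)
```

Each pair appears once (1 vs 2 but not 2 vs 1), since the test gives the same p-value in either order.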
EDIT 2:
I managed to do this using the code found here.