chi square test for each row in data frame

Question

I have a data frame containing independent counts of two observers of the same process.

obs.1 <- c(2,10,53,13,12,15,5)
obs.2 <- c(3,12,45,2,7,17,5)
df <- data.frame(obs.1,obs.2)

I want to use a chi-square test (chisq.test, from R's stats package) on each row to see if there is a significant difference between obs.1 and obs.2. I would like to add the results (X-squared, p-value) to the df. I have the feeling the apply function is the correct way to implement this, but I haven't been successful.

Have you tried `cbind(df, t(apply(df, 1, function(x) {ch <- chisq.test(x); c(unname(ch$statistic), ch$p.value)})))` — akrun, Jan 28 '15 at 12:57
@CathG I am using chisq as it is used in other similar examples. Kappa is for categorical data only? — doncarlos, Jan 28 '15 at 13:27
@doncarlos If you have doubts about which test to use ( in general statistical questions), http://stats.stackexchange.com/ might be a better place to post the question — akrun, Jan 28 '15 at 13:38
@akrun, after more intense thinking (...), I'll change my first idea of kappa to a Wilcoxon (or t test, depending on the number of points), mostly because kappa is indeed more appropriate for categorical data, so small changes in the values between observers can yield a bad kappa coefficient even when the difference may not actually be significant. But I guess it really depends on the "nature of data" — Cath, Jan 28 '15 at 13:39
@CathG Thanks for your thoughts. Just to clarify; two observers are independently looking at the same process (objects passing) and count what they see. For each row the color of the objects is different and so I want to see if there are any statistical differences between the two observers / color combinations. — doncarlos, Jan 28 '15 at 13:40
so I definitely would go for a pairwise test (and really definitely not a chi-square by row...) but as @akrun said, you can ask this question on stats.exchange. — Cath, Jan 28 '15 at 13:42
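
The pairwise approach suggested in the comments might look roughly like the sketch below. It assumes each row is one paired observation (one colour per row) and uses base R's wilcox.test and t.test with paired = TRUE. Which test actually suits the data is a statistical question, so treat this as an illustration, not a recommendation.

```r
# Counts from the question: one row per object colour, one column per observer
obs.1 <- c(2, 10, 53, 13, 12, 15, 5)
obs.2 <- c(3, 12, 45, 2, 7, 17, 5)

# Paired Wilcoxon signed-rank test across all rows at once.
# exact = FALSE requests the normal approximation, because the tied and
# zero differences in these data rule out an exact p-value.
w <- wilcox.test(obs.1, obs.2, paired = TRUE, exact = FALSE)

# Paired t test, the parametric alternative mentioned in the comments
tt <- t.test(obs.1, obs.2, paired = TRUE)

c(wilcoxon = w$p.value, t_test = tt$p.value)
```

Unlike the row-by-row chi-square test, this yields a single p-value per test for the overall difference between the two observers.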

score 8 · Answer 1 · answered Jan 28 '15 at 14:38

Here is another option using dplyr:

library(dplyr)

# rowwise() makes mutate() evaluate each expression one row at a time
df %>%
  rowwise() %>%
  mutate(
    test_stat = chisq.test(c(obs.1, obs.2))$statistic,
    p_val = chisq.test(c(obs.1, obs.2))$p.value
  )

answered Jan 28 '15 at 14:38

davechilders


score 3 · Accepted Answer · answered Jan 28 '15 at 13:02

You can use apply with MARGIN = 1 to run chisq.test on each row. Extract the values using $statistic and $p.value and cbind them to the dataset.

 df1 <- cbind(df, t(apply(df, 1, function(x) {
             ch <- chisq.test(x)
             c(unname(ch$statistic), ch$p.value)})))

 colnames(df1)[3:4] <- c('x-squared', 'p-value')
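
For the special case of exactly two counts per row, the same numbers can also be computed without any loop. With two counts a and b tested against equal probabilities, the expected count in each cell is (a + b) / 2, so the statistic simplifies to (a - b)^2 / (a + b) on 1 degree of freedom. The sketch below relies on that simplification; the names x_squared, p_value and df2 are chosen here for illustration.

```r
obs.1 <- c(2, 10, 53, 13, 12, 15, 5)
obs.2 <- c(3, 12, 45, 2, 7, 17, 5)
df <- data.frame(obs.1, obs.2)

# Vectorised goodness-of-fit statistic for two counts per row.
# Note: a row where both counts are 0 would give NaN here.
x_squared <- (df$obs.1 - df$obs.2)^2 / (df$obs.1 + df$obs.2)

# Upper-tail probability of the chi-square distribution with 1 df
p_value <- pchisq(x_squared, df = 1, lower.tail = FALSE)

df2 <- cbind(df, x_squared, p_value)

# Cross-check the first row against chisq.test (warning suppressed because
# the expected counts in that row are below 5)
ref <- suppressWarnings(chisq.test(c(df$obs.1[1], df$obs.2[1])))
all.equal(unname(ref$statistic), x_squared[1])
```

This only applies to the two-column layout in the question; with more columns per row, the apply version above is the more general route.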

answered Jan 28 '15 at 13:02

akrun


this works. initially had an issue as a few rows contain NA. This was resolved with (na.omit(data)). – doncarlos Jan 28 '15 at 13:25
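
na.omit drops the incomplete rows entirely. If the rows should stay in the data frame, one possible alternative is to return NA for them instead. The sketch below assumes that behaviour is wanted; df_na and row_test are hypothetical names, and the missing value is made up for illustration.

```r
# Hypothetical data with a missing count in row 2 (not from the question)
df_na <- data.frame(obs.1 = c(2, NA, 53), obs.2 = c(3, 12, 45))

# Return NA results for rows with a missing count instead of dropping them
row_test <- function(x) {
  if (anyNA(x)) return(c(x_squared = NA_real_, p_value = NA_real_))
  # suppressWarnings hides the "approximation may be incorrect" warning
  ch <- suppressWarnings(chisq.test(x))
  c(x_squared = unname(ch$statistic), p_value = ch$p.value)
}

res <- cbind(df_na, t(apply(df_na, 1, row_test)))
res
```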

score 2 · Answer 3 · answered Jan 28 '15 at 13:02

There are a number of ways to do this. One is to use apply to go through each row (MARGIN = 1) and then extract whatever part of the output you want (I use lapply to climb through each list element).

xy <- data.frame(obs1 = c(3,12,45,2,7,17,5), obs2 = c(2,10,53,13,12,15,5))
result <- apply(X = xy, MARGIN = 1, FUN = chisq.test)

Warning message:
In FUN(newX[, i], ...) : Chi-squared approximation may be incorrect

# see where p-value is stored
str(chisq.test(xy[1, ]))

List of 9
 $ statistic: Named num 0.2
  ..- attr(*, "names")= chr "X-squared"
 $ parameter: Named num 1
  ..- attr(*, "names")= chr "df"
 $ p.value  : num 0.655 # thar she blows
 $ method   : chr "Chi-squared test for given probabilities"
 $ data.name: chr "xy[1, ]"
 $ observed : num [1:2] 3 2
 $ expected : num [1:2] 2.5 2.5
 $ residuals: num [1:2] 0.316 -0.316
 $ stdres   : num [1:2] 0.447 -0.447
 - attr(*, "class")= chr "htest"

Warning message:
In chisq.test(xy[1, ]) : Chi-squared approximation may be incorrect

unlist(lapply(result, "[", "p.value"), use.names = FALSE)

[1] 0.654720846 0.669815358 0.419020334 0.004508698 0.251349109 0.723673610 1.000000000
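
As a variation on the extraction step, vapply can pull out the p-values while checking that each test returns exactly one number. The sketch below also flags which rows trigger the warning: chisq.test warns when any expected count is below 5. The names p_values and small_expected are chosen here for illustration.

```r
xy <- data.frame(obs1 = c(3, 12, 45, 2, 7, 17, 5),
                 obs2 = c(2, 10, 53, 13, 12, 15, 5))

# Warnings are suppressed here because small expected counts are flagged below
result <- suppressWarnings(apply(X = xy, MARGIN = 1, FUN = chisq.test))

# vapply enforces that every element yields exactly one numeric value
p_values <- vapply(result, function(h) h$p.value, numeric(1))

# TRUE for rows where any expected count is below 5
small_expected <- vapply(result, function(h) any(h$expected < 5), logical(1))

data.frame(xy, p_value = unname(p_values),
           small_expected = unname(small_expected))
```

For these data only the first row (3 and 2, expected 2.5 each) has expected counts below 5.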
