Independence between two qualitative variables

Question

I want to test whether there is a dependency between two qualitative variables. Before running any test, I plotted them with geom_bar().

To me, it seems quite evident that when the factor variable is equal to 1, the dependent variable is more often equal to 3 than when the factor variable is equal to 0. And when the factor variable is equal to 0, the dependent variable is more often equal to 2 than when the factor variable is equal to 1.

But if I run chisq.test or fisher.test, I get a p-value greater than 0.3, which I take to mean that the two qualitative variables are independent. I don't really understand why the tests are not significant. To perform the tests, I used the following code:

chisq.test(table(variable1,variable2))

where variable1 and variable2 are categorical variables.

Thanks in advance for your help,

C

We really need to see the data. Whether a difference is significant depends on the sample size, so looking at a bar chart of percentages does not help. Use `dput(variable1)` and `dput(variable2)` and paste the results into your question as a code sample. – dcarlson, May 11 '21 at 14:01

Recap_Hessian · Answer 1 · 2021-05-11T16:48:28.363

Here's a way to rebuild the counts from the percentages in your bar chart and run the test on them:

#function borrowed from https://stackoverflow.com/a/32544987/4938484
#rounds every entry to an integer while keeping the overall sum unchanged
smart.round <- function(x) {
  y <- floor(x)
  #the entries with the largest fractional parts are the ones rounded up
  indices <- tail(order(x - y), round(sum(x)) - sum(y))
  y[indices] <- y[indices] + 1
  y
}

#N is the number of observations per group, since each row of
#percentages sums to 100; change it to your own value
N <- 100
#row percentages read off your bar chart
tab <- matrix(c(8.1, 51.4, 40.5, 3.7, 37.0, 59.3), ncol = 3, byrow = TRUE)
#convert the percentages to counts
tab <- smart.round(tab / 100 * N)
rownames(tab) <- c("0", "1")
colnames(tab) <- c("1", "2", "3")
tab <- as.table(tab)
chisq.test(tab)
#which gives p-value = 0.03 for N = 100
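
To see how strongly the result depends on the sample size, here is a minimal sketch that reruns the test on the same percentages for a few group sizes. The helper name `p_for_n` and the group sizes are made up for illustration, and the sketch assumes both groups have the same number of observations. It uses plain `round()`, so a row total can be off by one.

```r
# Row percentages read off the bar chart (each row sums to 100)
pct <- matrix(c(8.1, 51.4, 40.5,
                3.7, 37.0, 59.3),
              ncol = 3, byrow = TRUE,
              dimnames = list(factor = c("0", "1"),
                              response = c("1", "2", "3")))

# Hypothetical helper: convert the percentages to counts for n
# observations per group and return the chi-squared p-value
p_for_n <- function(n) {
  counts <- round(pct / 100 * n)
  # small expected counts trigger an approximation warning; it is
  # silenced here only to keep the illustration readable
  suppressWarnings(chisq.test(counts)$p.value)
}

# Illustrative group sizes, not taken from the question
group_sizes <- c(20, 50, 100, 500)
p_values <- sapply(group_sizes, p_for_n)
data.frame(n_per_group = group_sizes, p_value = signif(p_values, 2))
```

The percentages are identical in every row of the output, yet the p-value shrinks as the group size grows, so a bar chart of percentages alone cannot tell you whether the test will be significant.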

edited May 11 '21 at 16:48

answered May 11 '21 at 14:41

Recap_Hessian


@user20650 Yeah, it's probably not accurate to apply the test to percentages. Ideally they would multiply all entries of the table by the sample size. – Recap_Hessian May 11 '21 at 16:09
Agreed; I think the code in the question looks correct. For the OP, perhaps the counts (n) are small, hence the non-significance. Just showing percentages can be misleading. – user20650 May 11 '21 at 16:12
@user20650 Updated to reflect that. – Recap_Hessian May 11 '21 at 16:19
