Questions tagged [contingency]


A contingency table is a non-negative integer matrix with specified row and column sums, so named by Karl Pearson in developing statistical tests of significance. Observations are counted in a table with appropriate row and column labels, and statistical tests can then be applied to the entries to determine how likely the observed counts would be if the row and column outcomes were independent events.
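As a minimal sketch of the counting step and of a test statistic against independence, the following uses only the Python standard library. The function names `contingency_table` and `pearson_chi_square` are illustrative, not taken from any library, and the statistic assumes every row and column sum is positive so that no expected count is zero.

```python
from collections import Counter


def contingency_table(rows, cols):
    """Count paired observations into a table with sorted row/column labels."""
    counts = Counter(zip(rows, cols))
    row_labels = sorted(set(rows))
    col_labels = sorted(set(cols))
    # Counter returns 0 for label pairs that were never observed.
    table = [[counts[(r, c)] for c in col_labels] for r in row_labels]
    return row_labels, col_labels, table


def pearson_chi_square(table):
    """Pearson's chi-square statistic under the independence model.

    Assumes all row and column sums are positive (hypothetical helper).
    """
    row_sums = [sum(row) for row in table]
    col_sums = [sum(col) for col in zip(*table)]
    total = sum(row_sums)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count if row and column outcomes were independent.
            expected = row_sums[i] * col_sums[j] / total
            stat += (observed - expected) ** 2 / expected
    return stat
```

Comparing the statistic with a chi-square distribution on (rows - 1)(columns - 1) degrees of freedom gives the usual significance test; that step is left out here because it needs a distribution function from a statistics package.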

Given specified row and column sums, counting the number of possible contingency tables can be a hard problem. Indeed, even the case of $2$ rows and $n$ columns is known to be #P-complete.
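For very small margins the count can still be found by exhaustive search. The sketch below fills the table one row at a time and memoizes on the remaining column sums; `count_tables` is a hypothetical name, the inputs are assumed to be non-negative integers, and the running time grows quickly with the size of the margins, consistent with the hardness result above.

```python
from functools import lru_cache


def count_tables(row_sums, col_sums):
    """Count non-negative integer matrices with the given margins (brute force)."""
    if sum(row_sums) != sum(col_sums):
        return 0
    rows = tuple(row_sums)

    def distribute(amount, caps):
        """Yield the column capacities left after placing `amount` in one row."""
        if not caps:
            if amount == 0:
                yield ()
            return
        for x in range(min(amount, caps[0]) + 1):
            for rest in distribute(amount - x, caps[1:]):
                yield (caps[0] - x,) + rest

    @lru_cache(maxsize=None)
    def fill(row_index, remaining_cols):
        # All rows placed: valid only if every column sum is used up.
        if row_index == len(rows):
            return 1 if all(c == 0 for c in remaining_cols) else 0
        return sum(
            fill(row_index + 1, new_cols)
            for new_cols in distribute(rows[row_index], remaining_cols)
        )

    return fill(0, tuple(col_sums))
```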

However, the existence of solutions, unless otherwise constrained, is easy to decide: it is necessary and sufficient that the row sums and the column sums add up to the same grand total for the entries of the entire matrix (the balance condition).
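Sufficiency of the balance condition can be illustrated constructively. The sketch below uses the greedy "northwest corner" fill, which is one standard way to build a table and is not claimed to be the only one; `northwest_corner` is an illustrative name and the margins are assumed to be non-negative integers.

```python
def northwest_corner(row_sums, col_sums):
    """Return one table with the given margins, or None if they are unbalanced."""
    if sum(row_sums) != sum(col_sums):
        return None  # balance condition fails, so no table exists
    rows, cols = list(row_sums), list(col_sums)
    table = [[0] * len(cols) for _ in rows]
    i = j = 0
    while i < len(rows) and j < len(cols):
        # Put as much as possible into the current cell.
        x = min(rows[i], cols[j])
        table[i][j] = x
        rows[i] -= x
        cols[j] -= x
        # Move down when the row is exhausted, otherwise move right.
        if rows[i] == 0:
            i += 1
        else:
            j += 1
    return table
```

For row sums (3, 2) and column sums (1, 2, 2), this produces the table with rows (1, 2, 0) and (0, 0, 2).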

An example of a further constraint is requiring 0/1 entries; such tables are called binary contingency tables. Necessary and sufficient criteria for the existence of these restricted solutions were given by Gale and Ryser (independently) in 1957.
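The Gale–Ryser criterion is commonly stated as follows: with the row sums sorted in non-increasing order as $r_1 \ge \dots \ge r_m$ and column sums $c_1, \dots, c_n$, a 0/1 table exists if and only if the totals agree and $\sum_{i \le k} r_i \le \sum_j \min(c_j, k)$ for every $k$ from $1$ to $m$. A minimal sketch of that check, with the hypothetical name `gale_ryser`:

```python
def gale_ryser(row_sums, col_sums):
    """Check whether a 0/1 matrix with the given margins exists."""
    if any(v < 0 for v in list(row_sums) + list(col_sums)):
        return False
    if sum(row_sums) != sum(col_sums):
        return False
    r = sorted(row_sums, reverse=True)  # non-increasing row sums
    prefix = 0
    for k, r_k in enumerate(r, start=1):
        prefix += r_k
        # The k largest rows can draw at most min(c_j, k) ones from column j.
        if prefix > sum(min(c, k) for c in col_sums):
            return False
    return True
```

For instance, row sums (2, 2) with column sums (3, 1) are balanced but fail the check, because a column in a two-row 0/1 table cannot sum to 3.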

242 questions
26
votes
6 answers

How do I get a contingency table?

I am trying to create a contingency table from a particular type of data. This would be doable with loops etc... but because my final table would contain more than 10E5 cells, I am looking for a pre-existing function. My initial data are as…
Julien
  • 509
  • 1
  • 5
  • 8
9
votes
5 answers

Creating a contingency table using multiple columns in a data frame in R

I have a data frame which looks like this: structure(list(ab = c(0, 1, 1, 1, 1, 0, 0, 0, 1, 1), bc = c(1, 1, 1, 1, 0, 0, 0, 1, 0, 1), de = c(0, 0, 1, 1, 1, 0, 1, 1, 0, 1), cl = c(1, 2, 3, 1, 2, 3, 1, 2, 3, 2)), .Names = c("ab", "bc", "de", "cl"),…
Apricot
  • 2,925
  • 5
  • 42
  • 88
8
votes
2 answers

Creating a Contingency table in Pandas

I want to create a contingency table in Pandas. I can do it with the following code but I wondered if there is a pandas function that would do it for me. For a reproducible example: toy_data…
user8270077
  • 4,621
  • 17
  • 75
  • 140
8
votes
1 answer

How to add the total sums to the table and get proportion for each cell in R

I'm trying to get a proportion for each cell by dividing each row by the row sum, but R gives me an error saying: Error in data.table$Country : $ operator is invalid for atomic vectors. How can I fix this? Also, how can I add total sum values for…
Pirate
  • 311
  • 1
  • 5
  • 12
7
votes
5 answers

An event log source that's always available for writing?

Is there an event log source that's always available for writing by an ASP.NET webapp? Backstory, in case anyone has a seemingly unrelated solution: Our ASP.NET webapp uses its own event log source, but it doesn't have the rights to create it. So,…
lance
  • 16,092
  • 19
  • 77
  • 136
5
votes
1 answer

produce contingency table in rmarkdown using kable

I am trying to produce a well-formatted contingency table in an rmarkdown html document. Here is the code: --- title: "Probabilidad" author: "Nicolás Molano Gonzalez" date: "7 de Abril de 2020" output: html_document: fig_caption: true --- ```{r…
Nicolas Molano
  • 693
  • 4
  • 15
5
votes
1 answer

p-value from fisher.test() does not match phyper()

Fisher's exact test is related to the hypergeometric distribution, and I would expect these two commands to return identical p-values. Can anyone explain what I'm doing wrong that they do not match? #data (variable names chosen to match…
R-Peys
  • 123
  • 1
  • 9
5
votes
5 answers

Getting values that appear exactly n-times

I started thinking about this problem while trying to get the values from a vector that were not repeated. unique is not good (as far as I could gather from the documentation) because it gives you the repeated elements too, but only once. duplicated…
user4095160
4
votes
4 answers

Contingency Table in Mathematica

Trying to build what I believe to be a contingency table, please consider the following : dist = Parallelize[Table[RandomVariate[NormalDistribution[]], {100000}]]; dist2 = Rest@FoldList[0.95 # + #2 &, 0, dist]; dist3 = Rest@FoldList[0.95 # +…
500
  • 6,509
  • 8
  • 46
  • 80
4
votes
1 answer

Python: Chi 2 test produces wrong results (chi2_contingency)

I am trying to calculate the Chi square value in python, using a contingency table. Here is an example. +--------+------+------+ | | Cat1 | Cat2 | +--------+------+------+ | Group1 | 80 | 120 | | Group2 | 420 | 380…
valenzio
  • 773
  • 2
  • 9
  • 21
4
votes
1 answer

Include zero frequencies in 2-way frequency/contingency table

I am trying to make a contingency (frequency) table using table() in R for two integer variables, but the default option in table does not include all the values in the range for each. For example: a=c(1,2,3,5) b=c(1,1,2,3) table(a,b) returns: 1…
Justina Pinch
  • 367
  • 1
  • 7
4
votes
5 answers

How to get a data.frame with cases from a contingency table in r?

I would like to reproduce some calculations from a book (logit regression). The book gives a contingency table and the results. Here is the Table: . example <- matrix(c(21,22,6,51), nrow = 2, byrow = TRUE) #Labels: …
Martin
  • 307
  • 3
  • 10
4
votes
1 answer

How can you force inclusion of a level in a table in R?

Is there a way to force R's table function to include rows or columns even when they never occur in the data? For example, data.1 <- c(1, 2, 1, 2, 1, 2, 4) data.2 <- c(1, 4, 3, 3, 3, 1, 1) table(data.1, data.2) returns data.2 data.1 1 3 4 …
Andrew Steele
  • 493
  • 6
  • 16
3
votes
2 answers

Venn diagram from contingency table in R

I have data in the form of a contingency table, which displays abundance data, and I want to draw a Venn diagram from this data frame. Structure of my data: species_abundance<-data.frame(Genus = c("Parasphingorhabdus", "Loktanella", "Cytobacillus",…
Umar
  • 117
  • 7
3
votes
2 answers

Build a sparse matrix with items coexistence frequency (to analyze cross-selling of products)

I am stuck on creating a sparse matrix in which I can count the cross-selling frequency of products based on the cart and product ids. Sample data frame: x = data.frame( cart_id = c("1","1","1","2","2","3","4","5","5","6"), product_id =…
Bart
  • 128
  • 8