Create fake data to recreate a contingency table

Question

Is it possible to create fake data that recreates a contingency table?

For example:

originalTable <- matrix(c(188, 29, 20, 51), ncol = 2, byrow = TRUE)
colnames(originalTable) <- c("A", "B")
rownames(originalTable) <- c("C", "D")

Is it possible to generate a data frame from the table, with 288 pairs of observations, that reproduces the table?

I found the r2dtable function, but how can I extract its result or save it as a data frame?

r2dtable(1, c(217, 71), c(208, 80))

Thanks in advance

After the `data.frame(as.table(originalTable))` step, you may find alternatives here: [Replicate each row of data.frame and specify the number of replications for each row](https://stackoverflow.com/questions/2894775/replicate-each-row-of-data-frame-and-specify-the-number-of-replications-for-each) – Henrik, Mar 04 '18 at 08:44
Another: [Finding the original vectors from an interaction table](https://stackoverflow.com/questions/19841768/finding-the-original-vectors-from-an-interaction-table) — Henrik, Mar 04 '18 at 11:44
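
On the r2dtable call in the question: as far as I know, r2dtable returns a list of random tables that only share the requested row and column totals, so the cell counts will usually differ from originalTable. A minimal sketch of pulling one draw out as a data frame, assuming the labels C/D and A/B are re-attached by hand (the names rt, m and randomCounts are just illustrative):

```r
set.seed(1)                                    # only to make the random draw repeatable
rt <- r2dtable(1, c(217, 71), c(208, 80))      # list holding one 2 x 2 matrix
m  <- rt[[1]]                                  # extract that matrix from the list
dimnames(m) <- list(c("C", "D"), c("A", "B"))  # r2dtable does not carry labels
randomCounts <- as.data.frame(as.table(m))     # long format: Var1, Var2, Freq
randomCounts

rowSums(m)   # row totals of the draw
colSums(m)   # column totals of the draw
```

If the goal is to reproduce the exact cell counts rather than just the margins, expanding the original table itself is the more direct route.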

score 2 · Answer 1 · answered Mar 04 '18 at 03:18

You can just use the table to generate the correct number of pairs.

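# x accumulates one (row label, column label) pair per observation
x = c()
for(row in rownames(originalTable)) {
    for(col in colnames(originalTable)) {
        # repeat the pair as many times as the count in this cell
        x = rbind(x, matrix(rep(c(row, col), originalTable[row,col]), ncol=2, byrow=TRUE))
    }
}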

df = as.data.frame(x)
table(df)
#    V2
# V1    A   B
#   C 188  29
#   D  20  51

G5W

Great, that's exactly what I was searching for! I was trying to recreate some datasets that I found in textbooks and papers. Thanks again! – sergiouribe Mar 04 '18 at 03:24

score 2 · Accepted Answer · answered Mar 04 '18 at 04:02

You can use expandRows from my "splitstackshape" package:

library(splitstackshape)
expandRows(data.frame(as.table(originalTable)), "Freq")
#       Var1 Var2
# 1        C    A
# 1.1      C    A
# 1.2      C    A
# 1.3      C    A
# 1.4      C    A
# -----
# 1.19     C    A
# 1.20     C    A
# 1.21     C    A
# 1.22     C    A
# 1.23     C    A
# 1.24     C    A
# 1.25     C    A
# 1.26     C    A
# 1.27     C    A
# -----
# 4.43     D    B
# 4.44     D    B
# 4.45     D    B
# 4.46     D    B
# 4.47     D    B
# 4.48     D    B
# 4.49     D    B
# 4.50     D    B

nrow(.Last.value)
# [1] 288
sum(originalTable)
# [1] 288

You won't need as.table if your data is already a table object rather than a plain matrix.

Of course, you can also do it without a package:

data.frame(as.table(originalTable))[rep(sequence(prod(dim(originalTable))), 
                                        c(originalTable)), c(1, 2)]
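
To check that an expansion like this really reproduces the table, the result can be cross-tabulated again. Below is a minimal base R sketch; expand_table is a hypothetical helper name of mine, not something from a package, and it assumes a two-way table of non-negative whole-number counts:

```r
# Hypothetical helper: expand a two-way table of counts into one row per observation.
expand_table <- function(tab) {
  long <- as.data.frame(as.table(tab))            # one row per cell, counts in Freq
  idx  <- rep(seq_len(nrow(long)), long$Freq)     # repeat each cell index Freq times
  out  <- long[idx, setdiff(names(long), "Freq"), drop = FALSE]
  rownames(out) <- NULL                           # drop the 1, 1.1, 1.2, ... row names
  out
}

originalTable <- matrix(c(188, 29, 20, 51), ncol = 2, byrow = TRUE,
                        dimnames = list(c("C", "D"), c("A", "B")))
fake <- expand_table(originalTable)

nrow(fake) == sum(originalTable)                  # one row per counted observation
recreated <- table(fake$Var1, fake$Var2)          # cross-tabulate the fake data
all(recreated == originalTable)                   # TRUE when every cell matches
```

The rows come out grouped by cell; if the order matters for your use, they can be shuffled afterwards, e.g. with fake[sample(nrow(fake)), ].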

Thanks, this is simpler! Great package, I didn't know about it, kudos! – sergiouribe, Mar 04 '18 at 05:10

score 0 · Answer 3 · answered Apr 20 '23 at 06:37

Try:

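ctab <- as.table(matrix(c(12, 6, 6, 22), ncol=2))  # example 2 x 2 table of counts
ctab
wtab <- as.data.frame(ctab)                        # long format: Var1, Var2, Freq
wtab
index <- rep(1:nrow(wtab), wtab$Freq)              # repeat each row number Freq times
utab  <- wtab[index,]                              # one row per observation; Freq column is kept
utab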

sigbert
