R data.table fast add with binom.test

Question

I am trying to run binom.test on a data.table where both the X and N values are provided for each row. I saw this post, which uses a static N value, and tried to adapt it, but I get:

library(data.table)

dt = data.table(X=rbinom(100, 625, 1/5), N=rbinom(100, 625, 4/5))
dt[, P := binom.test(x=X, n=N)$p.value ]
# Error in binom.test(x = X, n = N) : incorrect length of 'x'

The post also mentions aggregating with by=X, but even then I get:

dt[, P := binom.test(x=X, n=N)$p.value, by=X ]
# Error in binom.test(x = X, n = N) : 'n' must be a positive integer >= 'x'

This happens even though N is always a positive integer greater than X. My goal is not to group by values of X, though; I want a binom.test p-value for every row.

It ranges from 50k to over 10 million, depending on the file. — dragon951, Feb 18 '20 at 18:15

score 1 · Accepted Answer · answered Feb 18 '20 at 00:04

We can group by row and apply binom.test to each row separately.

library(data.table)

dt[, P := binom.test(x=X, n=N)$p.value, by = seq_len(nrow(dt))]
# which is the same as
# dt[, P := binom.test(x=X, n=N)$p.value, by = 1:nrow(dt)]
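
As a rough illustration of why the per-row grouping helps: binom.test handles one test per call, so x and n each need to be a single value inside j. The sketch below uses a small made-up table (5 rows, arbitrary seed) rather than the asker's data, and the variable name msg is only for illustration.

```r
library(data.table)

set.seed(1)  # arbitrary seed, only so the illustration is repeatable
dt <- data.table(X = rbinom(5, 625, 1/5), N = rbinom(5, 625, 4/5))

# Whole columns at once: x has length 5, so binom.test stops with an error.
# tryCatch captures the error message instead of aborting.
msg <- tryCatch(dt[, binom.test(x = X, n = N)$p.value],
                error = function(e) conditionMessage(e))
msg

# One group per row: inside each group X and N are single values.
dt[, P := binom.test(x = X, n = N)$p.value, by = seq_len(nrow(dt))]
dt
```

This would also account for the by=X attempt in the question: grouping by X makes X a single value per group, but N still holds one value for every row in that group, so any group with more than one row passes an n of length greater than 1.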

Ronak Shah

Do you have a good link to explain why data.table requires the group by operation to do the row operations? I couldn't find anything at rdatatable.gitlab.io – dragon951 Feb 18 '20 at 18:18
@dragon951 As we want `p.value` for every row here, we group them by row. I am not aware if this use case is explained anywhere. – Ronak Shah Feb 18 '20 at 23:39

score 1 · Answer 2 · answered Feb 18 '20 at 00:05

We can use Map to loop over the corresponding elements of 'X' and 'N' in parallel:

library(data.table)
dt[, P := unlist(Map(function(x, y) binom.test(x = x, n = y)$p.value, X, N))]
head(dt)
#     X   N            P
#1: 104 510 3.737474e-43
#2: 137 501 8.640380e-25
#3: 140 517 3.982312e-26
#4: 131 498 6.476382e-27
#5: 114 506 1.000591e-36
#6: 120 507 8.940756e-34

Or, without an anonymous function:

dt[, P := sapply(Map(binom.test, x = X, n = N), `[[`, "p.value")]
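
Since the question mentions files with up to 10 million rows, a fully vectorized alternative may be worth trying. The sketch below is not from either answer. It assumes the defaults used throughout this thread (two-sided test, null probability p = 0.5), where the binomial distribution is symmetric and the two-sided p-value should reduce to twice the smaller tail, capped at 1. The helper name binom_p_half is made up for illustration, and the formula should not be used for other null probabilities.

```r
library(data.table)

# Hypothetical helper: two-sided exact binomial p-value for H0: p = 0.5 only.
# Relies on the symmetry of Binomial(n, 0.5); not valid for other null values.
binom_p_half <- function(x, n) {
  pmin(1, 2 * pbinom(pmin(x, n - x), n, 0.5))
}

set.seed(1)  # arbitrary seed, only so the illustration is repeatable
dt <- data.table(X = rbinom(100, 625, 1/5), N = rbinom(100, 625, 4/5))

# pbinom is vectorized, so no grouping or looping is needed here.
dt[, P := binom_p_half(X, N)]

# Spot-check against binom.test on the same rows.
ref <- mapply(function(x, n) binom.test(x = x, n = n)$p.value, dt$X, dt$N)
all.equal(dt$P, ref)
```

It is probably safest to run a spot-check like the last two lines on a sample of the real data before relying on the shortcut, because binom.test also returns confidence intervals and estimates that this helper does not compute.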