I'd like to convert a data frame to a disk.frame and then count the unique values in its first column. When I try this, I don't get the number of unique values; the result appears to be the number of workers instead.
library(disk.frame)
options(future.globals.maxSize = Inf)
setup_disk.frame(workers = 8)
This is an example dataset:
bigint <- sample(123901239804:901283455390, 3*10^5)
df <- data.frame(bigint)
df %>%
  summarize(ints = length(unique(bigint)))
df %>%
  as.disk.frame %>%
  summarize(ints = length(bigint)) %>%
  collect
The first query gives me this output:
ints
1 300000
The second query gives me this output:
ints
1 8