I have the following data frame:
library(tidyverse)
df <- data.frame(
  READS   = rep(c('READa', 'READb', 'READc'), each = 3),
  GENE    = rep(c('GENEa', 'GENEb', 'GENEc'), each = 3),
  COMMENT = rep(c('CommentA', 'CommentA', 'CommentA'), each = 3)
)
> df
  READS  GENE  COMMENT
1 READa GENEa CommentA
2 READa GENEa CommentA
3 READa GENEa CommentA
4 READb GENEb CommentA
5 READb GENEb CommentA
6 READb GENEb CommentA
7 READc GENEc CommentA
8 READc GENEc CommentA
9 READc GENEc CommentA
I want to count how often each READS/GENE pair occurs and spread the counts into one column per gene. The following works on a small data frame:
df %>%
  count(READS, GENE) %>%
  pivot_wider(
    names_from = GENE, values_from = n,
    values_fill = list(n = 0)
  )
# A tibble: 3 x 4
  READS GENEa GENEb GENEc
  <chr> <int> <int> <int>
1 READa     3     0     0
2 READb     0     3     0
3 READc     0     0     3
My real input data frame is very large: 27,748,156 rows (roughly 27.7 million). With a table that big, the same code fails with the error shown below.
Any idea how I can deal with such a large table?
Error: Can't index beyond the end of a vector.
The vector has length 1 and you've tried to subset element 712.
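One workaround that might be worth trying is to skip the wide tibble and build the READS x GENE count table as a sparse matrix. The sketch below runs on the toy data only. It assumes the Matrix package is installed, and the names reads, genes and counts are made up for illustration. I have not confirmed that it avoids the error on the full 27-million-row table.

```r
library(Matrix)

# Toy data with the same READS/GENE columns as above
df <- data.frame(
  READS = rep(c('READa', 'READb', 'READc'), each = 3),
  GENE  = rep(c('GENEa', 'GENEb', 'GENEc'), each = 3)
)

# Convert both columns to factors so each level maps to a row/column index
reads <- factor(df$READS)
genes <- factor(df$GENE)

# One entry of value 1 per input row; sparseMatrix() sums the values of
# repeated (i, j) pairs, so each cell ends up holding the pair count
counts <- sparseMatrix(
  i = as.integer(reads),
  j = as.integer(genes),
  x = rep(1, nrow(df)),
  dims = c(nlevels(reads), nlevels(genes)),
  dimnames = list(levels(reads), levels(genes))
)

counts
```

Only the non-zero cells are stored, so this may stay manageable when there are many distinct reads and genes but most combinations never occur. If a regular matrix is needed afterwards, as.matrix(counts) converts it, at the cost of allocating every cell.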