This question is similar to this one, but it involves a summarise, so the posted answer doesn't quite fit. Each row of the data identifies a unit-time pair:
large_sql_df
id t var1 var2
1 1 10 0
1 2 20 1
2 1 11 0
I would like to aggregate by var2 and time t:
localdf <- large_sql_df %>%
  group_by(var2, t) %>%
  summarise(count = n(), var1_mean = mean(var1))
This gives the error: "Arithmetic overflow error converting expression to data type int." I think this is because count becomes a very large number. Is there a way to stop this from happening without having to do the entire query in SQL?
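One direction that might be worth trying is sketched below. It rests on assumptions that are not confirmed here: that the backend is SQL Server (the error text matches its wording), and that the overflow comes from an int-typed aggregate. On SQL Server, COUNT returns int while COUNT_BIG returns bigint, and AVG over an int column fails if the intermediate sum exceeds the int range, so either count or var1_mean could be the culprit. The sketch uses a simulated connection (dbplyr's lazy_frame with simulate_mssql) purely to inspect the generated SQL; the one-row table is a stand-in for large_sql_df, not real data.

```r
library(dplyr)
library(dbplyr)

# Hypothetical stand-in for large_sql_df: a lazy table on a simulated
# SQL Server connection, so no database is needed to see the SQL.
large_sql_df <- lazy_frame(
  id = 1L, t = 1L, var1 = 10L, var2 = 0L,
  con = simulate_mssql()
)

query <- large_sql_df %>%
  group_by(var2, t) %>%
  summarise(
    # Pass COUNT_BIG(*) through verbatim so the count is a bigint
    count = sql("COUNT_BIG(*)"),
    # Cast to a floating-point type before averaging, so the
    # intermediate sum is not accumulated as an int
    var1_mean = mean(as.numeric(var1), na.rm = TRUE)
  )

# Print the SQL that dbplyr would send to the server
show_query(query)
```

Against the real table, the same summarise call followed by collect() would still run entirely through dplyr. Whether it actually removes the error depends on which aggregate overflows and on the dbplyr version in use.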