Is there a smart way to count the non-NA entries of Value per Ticker and Year using the data.table package, instead of the following dplyr code:

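# install.packages("dplyr")  # run once if dplyr is not yet installed
library(dplyr)
# Count the non-NA entries of Value within each Ticker/Year group
data %>% group_by(Ticker, Year) %>% summarise(count = length(Value[!is.na(Value)]))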
bli12blu12

1 Answer

Do you mean this?

(Note: The sample data is based on the data provided in your previous post; the grouping column is called RANDOM there, so substitute Ticker for your own data.)

library(data.table)
# Convert df to a data.table by reference, then count non-NA values per group
setDT(df)[, .(count = sum(!is.na(Value))), by = .(RANDOM, Year)]
#    RANDOM Year count
# 1:      D 2010     2
# 2:      C 2010     2
# 3:      B 2008     5
# 4:      D 2009     4
# 5:      D 2008     4
# 6:      A 2009     3
# 7:      B 2009     5
# 8:      C 2008     4
# 9:      A 2008     8
#10:      A 2010     2
#11:      B 2010     1
#12:      C 2009     8
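
To map this back to the column names in the question, here is a minimal sketch on a small made-up table; the object name `dt` and its contents are assumptions for illustration only, not your actual data. It also shows a variant using `.N`, which behaves differently when a group contains only NA values.

```r
library(data.table)

# Hypothetical toy data using the question's column names (Ticker, Year, Value)
dt <- data.table(
  Ticker = c("A", "A", "A", "B", "B"),
  Year   = c(2008, 2008, 2009, 2008, 2008),
  Value  = c(0.22, NA, 0.22, NA, NA)
)

# Count non-NA values per group; groups with only NA values get a count of 0
counts <- dt[, .(count = sum(!is.na(Value))), by = .(Ticker, Year)]

# Alternative: filter first, then count the remaining rows with .N.
# Groups that contain only NA values (here B/2008) are dropped entirely
# instead of being reported with a count of 0.
counts_n <- dt[!is.na(Value), .N, by = .(Ticker, Year)]
```

Which variant fits depends on whether all-NA groups should appear in the result with a zero count or not appear at all.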

Sample data

set.seed(2017)
RANDOM <- sample(c("A","B","C","D"), size = 100, replace = TRUE)
Year <- sample(c(2008,2009,2010), 100, TRUE)
Value <- sample(c(0.22, NA), 100, TRUE)
df <- data.frame(RANDOM, Year, Value)
Maurits Evers