I have a data frame with 10 columns. I am trying to get the count by year and val_1 with the code below, but I get this error:

"unused argument (by = year)" in data.table

year   val_1 . . . . . val_50
2012    A    . . . . . 23
2013    B    . . . . . 43

I tried to get the output with this code:

df$cnt = df[,.N,by=year,val_1 ] 

How should I modify the above code?

Jaap
sasir
  • You are using `data.table` syntax on a dataframe. If you want to use `data.table`-syntax, you need to convert your dataframe to a `data.table` with `setDT(df)`. – Jaap Sep 06 '17 at 05:59
  • Thus, you (highly probably) need: `library(data.table); setDT(df)[, cnt := .N, by = year, .SDcols = val_1]` to get it working. – Jaap Sep 06 '17 at 06:04
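
For reference, here is a minimal sketch of the `data.table` route suggested in the comments. It assumes the goal is one count per combination of `year` and `val_1`, and the sample data is made up for illustration (only the column names `year` and `val_1` come from the question). To group by two columns, both go inside `by = .(...)`.

```r
library(data.table)

# Hypothetical sample data; only the column names come from the question
df <- data.frame(
  year  = c(2012, 2012, 2013, 2013, 2013),
  val_1 = c("A", "A", "B", "B", "A")
)

# Convert the data.frame to a data.table by reference,
# so that the [i, j, by] syntax becomes available
setDT(df)

# Summary table: one row per (year, val_1) group, count in column N
counts <- df[, .N, by = .(year, val_1)]

# Alternatively, add the group size to every row as a new column
df[, cnt := .N, by = .(year, val_1)]
```

Note that `:=` modifies `df` in place, so there is no need to assign the result back with `df$cnt = ...`.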

1 Answer

If I understand you correctly, you're looking for count data per year, for each of your columns.

Try using `group_by()` and `summarise()` from dplyr:

library(dplyr) 
df %>% group_by(year) %>%
  summarise(count = n())
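
If the count is needed per combination of `year` and `val_1` rather than per year alone, the same pattern might be extended as in the sketch below. The sample data is hypothetical, and the `.groups` argument assumes dplyr 1.0 or later.

```r
library(dplyr)

# Hypothetical sample data; only the column names come from the question
df <- data.frame(
  year  = c(2012, 2012, 2013, 2013, 2013),
  val_1 = c("A", "A", "B", "B", "A")
)

# Summary table: one row per (year, val_1) group
counts <- df %>%
  group_by(year, val_1) %>%
  summarise(cnt = n(), .groups = "drop")

# Keep all original rows and attach the group size as a new column
df_with_cnt <- df %>%
  group_by(year, val_1) %>%
  mutate(cnt = n()) %>%
  ungroup()
```

`summarise()` collapses each group to a single row, while `mutate()` keeps every row, which is closer to the `df$cnt = ...` assignment attempted in the question.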
Rich Pauloo