
Say I have a data table like this:

id       days       age
"jdkl"   8          23
"aowl"   1          09
"mnoap"  4          82
"jdkl"   3          14
"jdkl"   2          34
"mnoap"  27         56

I want to create a new data table with one column holding the ids and one column holding the number of times each id appears. I know that data.table has the .N symbol, but I wasn't sure how to use it to count by a single column.

The final data table would look like this:

id          count
"jdkl"      3
"mnoap"     2
"aowl"      1
codercc

2 Answers


You can just use table from base R:

as.data.frame(sort(table(df$id), decreasing = TRUE))
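
Note that the result of the line above has the default column names Var1 and Freq. Here is a minimal, self-contained sketch that rebuilds the question's table and renames the columns; the names `counts`, `id`, and `count` are my own choice for illustration, not something the question requires:

```r
# Rebuild the example data from the question
df <- data.frame(
  id   = c("jdkl", "aowl", "mnoap", "jdkl", "jdkl", "mnoap"),
  days = c(8, 1, 4, 3, 2, 27),
  age  = c(23, 9, 82, 14, 34, 56),
  stringsAsFactors = FALSE
)

# Tabulate the ids, sort by frequency (largest first), convert to a data frame
counts <- as.data.frame(sort(table(df$id), decreasing = TRUE),
                        stringsAsFactors = FALSE)

# as.data.frame() on a table yields columns Var1 and Freq; rename them
names(counts) <- c("id", "count")
counts
```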

However, if you want to do it using data.table:

library(data.table)
setDT(df)[, .(Count = .N), by = id][order(-Count)]
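
To spell out what the data.table line does: .N is the number of rows in the current group, and by = id makes each distinct id its own group. A small sketch on the question's data, assuming the data.table package is installed (the object names `dt` and `counts` are only for illustration):

```r
library(data.table)

# Rebuild the example data from the question as a data.table
dt <- data.table(
  id   = c("jdkl", "aowl", "mnoap", "jdkl", "jdkl", "mnoap"),
  days = c(8, 1, 4, 3, 2, 27),
  age  = c(23, 9, 82, 14, 34, 56)
)

# .N counts the rows in each group defined by `by`;
# .(count = .N) names the resulting column "count"
counts <- dt[, .(count = .N), by = id]

# Chain a second [] to sort by count, largest first
counts <- counts[order(-count)]
counts
```

If you don't care about the column name, dt[, .N, by = id] also works and calls the count column N.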

Or there is the dplyr solution:

library(dplyr)
df %>% count(id) %>% arrange(desc(n))
Sumedh
    @shayaa: I arranged it to match the output posted by OP. Too lazy to arrange the output of base `R` :D – Sumedh Jul 29 '16 at 00:28

We can use dplyr:

library(dplyr)
df %>%
    group_by(id) %>%
    summarise(Count = n()) %>%
    arrange(desc(Count))

Or use aggregate from base R:

r1 <- aggregate(cbind(Count = days) ~ id, df, length)
r1[order(-r1$Count),]
#      id Count
#2  jdkl     3
#3 mnoap     2
#1  aowl     1
akrun