I have two variables, A and B.

a
neat
neat
sweet
sweet

I want to group by the variable a and get the number of rows in each group. In the example above, that is 2 for each group.

  • With `dplyr`: `df %>% group_by(a) %>% summarise(n = n())` or just `df %>% count(a)`. `table(df$a)` works too, depending on what you're trying to do. – alistaire May 19 '16 at 05:52

2 Answers

We can use data.table. Convert the 'data.frame' to a 'data.table' (setDT(df1)) and, grouped by 'a', assign (:=) a new column ('b') holding the number of rows in each group (.N).

setDT(df1)[, b := .N, by = a]

Or using ave from base R

df1$b <- with(df1, ave(seq_along(a), a, FUN = length)) 
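
As a quick check of the ave approach, here is a small sketch on the question's data. The construction of df1 is an assumption, since the question only lists the values of 'a'.

```r
# Assumed reconstruction of the question's data (only column 'a' is shown there)
df1 <- data.frame(a = c("neat", "neat", "sweet", "sweet"))

# ave() applies length() within each group of 'a' and recycles the
# result back to every row of that group
df1$b <- with(df1, ave(seq_along(a), a, FUN = length))

df1
```

Every row gets the size of its own group, so 'b' is 2 on all four rows here.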

Or, if the 'a' column is ordered so that equal values are adjacent, we can use the run lengths

df1$b <- with(rle(as.character(df1$a)), rep(lengths, lengths))

If we need the summarized output instead of creating a new column

setDT(df1)[, .(b = .N), by = a]
#       a b
#1:  neat 2
#2: sweet 2

Or with base R, we can use tabulate, which is fast but returns only the counts (in factor-level order), without the group labels.

tabulate(factor(df1$a))    
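
Since tabulate drops the labels, one possible way to keep them is to take the names from the factor levels. This is only a sketch; df1 is again an assumed reconstruction of the question's data.

```r
# Assumed reconstruction of the question's data
df1 <- data.frame(a = c("neat", "neat", "sweet", "sweet"))

f <- factor(df1$a)

# tabulate() counts the integer codes of the factor; nbins keeps one
# bin per level, even for levels that never occur
counts <- tabulate(f, nbins = nlevels(f))

# Reattach the group labels, which tabulate() does not return
names(counts) <- levels(f)

counts
```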
akrun

You may use aggregate

aggregate(df, list(df$a), length)

#   Group.1 a
#1    neat  2
#2   sweet  2

Or, as mentioned in the comments by @alistaire, you can also use table to get the frequency of each unique word

table(df$a)

# neat sweet 
#  2     2 
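
If the counts are needed as a data frame rather than a table object, one option is to convert the result. A minimal sketch, assuming df holds the question's column 'a'; the column name 'n' is just an illustrative choice.

```r
# Assumed reconstruction of the question's data
df <- data.frame(a = c("neat", "neat", "sweet", "sweet"))

# Naming the table() argument carries 'a' through as the column name;
# responseName sets the name of the count column
counts <- as.data.frame(table(a = df$a), responseName = "n")

counts
```

This gives one row per group, with the group label in 'a' and the row count in 'n'.
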
Ronak Shah