I would like to create a new frequency column and fill it with the total count of each unique value in item. I've tried:

df$frequency <- sum(df$item) #gives me total sum
df$frequency <- sum(unique(df$item)) # gives me 6 for some reason
df$frequency <- sum(df$item == 1) #gives me total count per selected value

But I would really like to generate them all at once.

example data:

> df <- data.frame("item" = c(1,1,1,1,2,2,2,3))
> df
  item
1    1
2    1
3    1
4    1
5    2
6    2
7    2
8    3

desired output:

> df
  item frequency
1    1         4
2    1         4
3    1         4
4    1         4
5    2         3
6    2         3
7    2         3
8    3         1

Much thanks in advance!

pyne

3 Answers


You can use table

df$frequency <- table(df$item)[df$item]

#   item frequency
#1    1         4
#2    1         4
#3    1         4
#4    1         4
#5    2         3
#6    2         3
#7    2         3
#8    3         1

Or with ave

df$frequency <- ave(seq_len(nrow(df)), df$item, FUN = length)
Ronak Shah
  • Your second is more robust; for example, `df <- data.frame("item" = c(5,1,2,2,2,5,4))` does not work with your first suggestion. – Henry Oct 04 '16 at 07:13
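
To illustrate the point in the comment above, here is a minimal sketch of why the positional `table` lookup breaks on that data, along with one possible workaround. The workaround (indexing the table by name via `as.character`) is not part of the original answer; it assumes the item values convert to exactly the same strings that `table` uses as names, which holds for whole numbers like these.

```r
# Example data from the comment: the values are not simply 1, 2, 3, ...
df <- data.frame(item = c(5, 1, 2, 2, 2, 5, 4))

# One count per distinct value; the names are "1", "2", "4", "5"
counts <- table(df$item)

# Positional lookup treats each value as an index into the 4-element table,
# so 5 falls past the end and 4 picks up the count that belongs to item 5
by_position <- as.integer(counts[df$item])

# Name-based lookup (assumed workaround): match values against the table names
df$frequency <- as.integer(counts[as.character(df$item)])
df
```

The `as.integer` call only strips the table attributes so that the new column is a plain integer vector.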

You can try data.table: create a column frequency that corresponds to the number of rows for each item, grouping by item:

library(data.table)
setDT(df)[, frequency:=.N, by=item]
df
#   item frequency
#1:    1         4
#2:    1         4
#3:    1         4
#4:    1         4
#5:    2         3
#6:    2         3
#7:    2         3
#8:    3         1
Cath

Do you want something like this?

df <- data.frame("item" = c(2,2, 1,1,1,1,2,2,2,3))
df <- data.frame(item=df[order(df$item),]) # if items are not ordered
df$frequency <- as.integer(rep(table(df), table(df)))
df
    item frequency
1     1         4
2     1         4
3     1         4
4     1         4
5     2         5
6     2         5
7     2         5
8     2         5
9     2         5
10    3         1
Sandipan Dey
  • This doesn't work when the item column is not ordered, e.g. when the item column is `c(2,2,1,1,1,1,2,2,2,3)`. – 9Heads Oct 04 '16 at 06:54
  • From your example I assumed that the column is ordered. If not, can't you reorder it first? – Sandipan Dey Oct 04 '16 at 06:55
  • But I may not want to sort the dataset by the item column, assuming there are other columns in the dataset, whereas the other answer gives the correct result regardless of ordering. – 9Heads Oct 04 '16 at 06:58
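
For the case raised in these comments, a rough sketch of the `ave` approach from the other answer applied to unsorted data may help. The `id` column is hypothetical, added here only to stand in for "other columns" whose row order should be left alone.

```r
# Unsorted item values from the comment, plus a hypothetical id column
df <- data.frame(item = c(2, 2, 1, 1, 1, 1, 2, 2, 2, 3),
                 id   = 1:10)

# ave() computes the group size and writes it back in the original row order,
# so the data frame does not need to be sorted first
df$frequency <- ave(seq_along(df$item), df$item, FUN = length)
df
```

Each row gets the size of its own group (5 for item 2, 4 for item 1, 1 for item 3), and `id` stays in its original order.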