Finding duplicate row counts while keeping all rows

Question

I have a data frame df1

I need to count how many times each row is duplicated while also keeping the duplicate rows. The result should look like:

    a  c count
1:  1  6   1
2:  2  8   2
3:  3  1   1
4: 45  3   1
5:  2  8   2

since rows 2 and 5 are duplicates. But the only solution I have found gives

    a  c count
1:  1  6   1
2:  2  8   2
3:  3  1   1
4: 45  3   1

by doing

    library(data.table)
    df1 <- data.table(df1)
    df1[, .N, by = list(a, c)]

How could I get the desired result?

You're basically there.... `dt[ , count := .N , by=list(a,c) ]` — Simon O'Hanlon, Apr 08 '14 at 19:45
Hi, Take a bit of time and read the tag excerpt before tagging. [tag:dataframes] is for pandas, whereas you need [tag:data.frame] here. Be careful the next time. See this meta post. [Warn \[r\] users from adding \[dataframes\] tag instead of \[data.frame\] tag](http://meta.stackoverflow.com/q/318933) — Bhargav Rao, Mar 14 '16 at 15:05
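
Simon O'Hanlon's `:=` suggestion above can be sketched as follows. The question does not include the input data, so the sample values here are an assumption reconstructed from the expected output, and `dt` is just an illustrative name.

```r
library(data.table)

# Sample data reconstructed from the expected output in the question (assumed)
dt <- data.table(a = c(1, 2, 3, 45, 2),
                 c = c(6, 8, 1, 3, 8))

# := adds the group size as a new column by reference,
# so all five rows are kept instead of being collapsed to one per group
dt[, count := .N, by = list(a, c)]

print(dt)
```

The difference from `df1[, .N, by = list(a, c)]` is that the latter aggregates to one row per group, while `:=` assigns the per-group `.N` back onto every row of that group.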

score 3 · Accepted Answer · answered Apr 08 '14 at 19:53

You may also do it in base R:

df1$count <- with(df1, ave(a, list(a, c), FUN = length))

df1
#     a c count
# 1:  1 6     1
# 2:  2 8     2
# 3:  3 1     1
# 4: 45 3     1
# 5:  2 8     2

answered Apr 08 '14 at 19:53

Henrik


score 3 · Answer 2 · answered Apr 08 '14 at 20:02

For completeness, here's a way with dplyr

df <- data.frame(
  a = c(1, 2, 3, 45, 2),
  c = c(6, 8, 1, 3, 8)
)

library(dplyr)

# Early dplyr releases chained with %.%; current versions use %>% instead
df %>% group_by(a, c) %>% mutate(count = n())

## Source: local data frame [5 x 3]
## Groups: a, c
## 
##    a c count
## 1  1 6     1
## 2  2 8     2
## 3  3 1     1
## 4 45 3     1
## 5  2 8     2

answered Apr 08 '14 at 20:02

hadley


From `?n`: "This function is implemented special for each data source and **can only be used from within summarise**" (`dplyr` 0.1.3). But you are @hadley and can use it wherever you want! ;) +1 for magic. – Henrik Apr 08 '14 at 20:35
@Henrik hmmm, that should really say inside summarise, mutate, or filter – hadley Apr 09 '14 at 01:41
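
As a small illustration of `n()` inside `filter`, which the comment above mentions, here is a hedged sketch that keeps only the rows whose `(a, c)` combination occurs more than once. It assumes a current dplyr release where `%>%` is available; `dups` is a hypothetical name, and the sample data is the same as in the answer.

```r
library(dplyr)

df <- data.frame(
  a = c(1, 2, 3, 45, 2),
  c = c(6, 8, 1, 3, 8)
)

# Within each (a, c) group, n() is the group size;
# filter() keeps only the groups that have more than one row
dups <- df %>%
  group_by(a, c) %>%
  filter(n() > 1) %>%
  ungroup()

print(dups)
```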
