How to add a column with counts of duplicate rows to a data frame

Question

I have data as follows:

Date         ID   Value
01-10-2016   1      5
01-10-2016   2      8
01-10-2016   3      7
02-10-2016   1      3
02-10-2016   1      5
02-10-2016   2      8
02-10-2016   3      6
....

I want to identify rows where the values in the Date and ID columns are identical, and add an extra column to the data frame. The value in that column should equal the number of times the row's Date/ID combination occurs.

Currently, I am using a for loop.

df$DateID <- paste(df$Date, df$ID) # combine the Date and ID columns into one key

n_occur <- data.frame(table(df$DateID)) # count the occurrences of each key

multi <- c()  # will hold each key's count, repeated once per occurrence

# note: table() returns the keys in sorted order, so the result only lines up
# with df when df is already sorted by DateID
for (i in seq_len(nrow(n_occur))) {
   multi <- c(multi, rep(n_occur$Freq[i], n_occur$Freq[i]))
}

df <- cbind(df, multi)

Output:

Date         ID   Value  multi
01-10-2016   1      5     1
01-10-2016   2      8     1
01-10-2016   3      7     1
02-10-2016   1      3     2
02-10-2016   1      5     2
02-10-2016   2      8     1
02-10-2016   3      6     1
....

I would like to know if there is a simpler way to do this.

I have looked at the following posts, but they only label the unique entries:

Create New Column If Statement Based on Duplicate Rows in R

Find duplicated rows (based on 2 columns) in Data Frame in R

Find duplicate values in R

Using dplyr, `df %>% group_by(Date, ID) %>% mutate(new = n())` or with base R, `with(df, ave(Value, Date, ID, FUN = length))` — Sotos, Jun 29 '17 at 08:23
just found the following link - it may help: https://stackoverflow.com/questions/20275325/add-column-with-counts-of-another — Sree, Jun 29 '17 at 08:26
There is also this answer which summarizes several methods: https://stackoverflow.com/a/7450633/3817004 — Uwe, Jun 29 '17 at 09:30
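
For reference, the base R suggestion from the first comment can be sketched as a self-contained script. This is only a minimal sketch: the data frame is rebuilt by hand from the sample rows in the question, and `seq_along(df$Date)` is used as the first argument to `ave()` instead of `Value` (an assumption made here so that the result is always an integer count, regardless of the type of `Value`).

```r
# Sample data copied from the question (Date kept as plain text)
df <- data.frame(
  Date  = c("01-10-2016", "01-10-2016", "01-10-2016",
            "02-10-2016", "02-10-2016", "02-10-2016", "02-10-2016"),
  ID    = c(1, 2, 3, 1, 1, 2, 3),
  Value = c(5, 8, 7, 3, 5, 8, 6)
)

# ave() splits its first argument by the grouping variables (Date, ID),
# applies FUN to each group, and writes the result back in the original
# row order. With FUN = length, every row receives the size of its group.
df$multi <- ave(seq_along(df$Date), df$Date, df$ID, FUN = length)

print(df)
```

Unlike the loop in the question, this does not depend on the rows being sorted by Date and ID, because `ave()` puts each group's count back at the positions the group's rows came from.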
