I have data as follows:

Date         ID   Value
01-10-2016   1      5
01-10-2016   2      8
01-10-2016   3      7
02-10-2016   1      3
02-10-2016   1      5
02-10-2016   2      8
02-10-2016   3      6
....

I want to identify rows where the values in the Date and ID columns are identical, and add an extra column to the data frame. The value in that column should equal the number of times that Date/ID combination occurs.

Currently, I am using a for loop.

df$DateID <- paste(df$Date, df$ID)  # combine Date and ID into a single key

n_occur <- data.frame(table(df$DateID))  # count the occurrences of each key

multi <- c()  # will hold each key's count, repeated once per occurrence

# Note: table() returns the keys in sorted order, so the result only lines up
# with df if the rows of df are already in that order.
for (i in seq_len(nrow(n_occur))) {
   multi <- c(multi, rep(n_occur$Freq[i], n_occur$Freq[i]))
}

df <- cbind(df, multi)

Output:

Date         ID   Value  multi
01-10-2016   1      5     1
01-10-2016   2      8     1
01-10-2016   3      7     1
02-10-2016   1      3     2
02-10-2016   1      5     2
02-10-2016   2      8     1
02-10-2016   3      6     1
....

I would like to know if there is a simpler way to do this.

I have looked at the following posts, but they only label the unique entries.

Create New Column If Statement Based on Duplicate Rows in R

Find duplicated rows (based on 2 columns) in Data Frame in R

Find duplicate values in R

  • Is using the dplyr package an option for this task? – Jan Jun 29 '17 at 08:23
  • Using dplyr, `df %>% group_by(Date, ID) %>% mutate(new = n())` or with base R, `with(df, ave(Value, Date, ID, FUN = length))` – Sotos Jun 29 '17 at 08:23
  • Just found the following link; it may help: https://stackoverflow.com/questions/20275325/add-column-with-counts-of-another – Sree Jun 29 '17 at 08:26
  • There is also this answer which summarizes several methods: https://stackoverflow.com/a/7450633/3817004 – Uwe Jun 29 '17 at 09:30
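
For reference, the base R suggestion from the comments can be tried on the sample rows from the question. This is only a minimal sketch: the data frame below is rebuilt by hand from the seven rows shown (the elided rows are left out), and the column name `multi` is taken from the desired output. The dplyr one-liner from the same comment should give the same counts, but it is left as a comment here so the example runs without extra packages.

```r
# Sample data rebuilt from the rows shown in the question (assumption:
# Date is kept as a plain character column).
df <- data.frame(
  Date  = c("01-10-2016", "01-10-2016", "01-10-2016",
            "02-10-2016", "02-10-2016", "02-10-2016", "02-10-2016"),
  ID    = c(1, 2, 3, 1, 1, 2, 3),
  Value = c(5, 8, 7, 3, 5, 8, 6)
)

# ave() applies FUN within each Date/ID group and returns a vector aligned
# with the original rows, so no helper key, sorting, or loop is needed.
df$multi <- with(df, ave(Value, Date, ID, FUN = length))

# dplyr equivalent from the comment (requires the dplyr package):
# df %>% group_by(Date, ID) %>% mutate(multi = n())

print(df)
```

Because `ave()` keeps the counts aligned with the original rows, this does not depend on the data frame being sorted by Date and ID.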
