1

I have trip data organised by month. It looks like this

source <- c(1,1,2,2)
dest <- c(2,2,1,1)
month <- c(1,2,1,2)
time <- c(0.23, 0.34, 0.29, 0.33)
df <- data.frame(source, dest, month, time)

> df
source   dest    month    time
  1        2        1     0.23
  1        2        2     0.34
  2        1        1     0.29
  2        1        2     0.33

I would like time to become the average time for all months of the trips which have the same source and same destination, like this:

source   dest     time
  1        2      0.285
  2        1      0.31

Here are the steps I'm thinking of:

  • If the pattern of source and dest is not unique
  • time is equal to the mean of time for all month where this pattern is repeated.

Here's what I've tried

df$merged <- paste(df$source, df$dest, sep='')

The trouble is:

  1. I don't know of a logical function that returns TRUE/FALSE values based on whether the row is unique. There's duplicated but that classifies the first instance of a pattern as not duplicated. In my example then, the first and third rows would be classed as not duplicated, while I want them to be classed as such.

  2. I don't how to tell R that, based on the above condition, the rows that are not unique should be associated with their duplicates and the times for those rows should be averaged into another column.

Does anyone know how to do what I'm thinking of?

Ben Mann
  • 173
  • 4
  • 11

0 Answers0