How do you subset a data frame that has multiple duplicate values in R?

Question

I have this data frame called data1:

dput(data1)

structure(list(Time = structure(c(1421561100, 1421561100, 1421564700, 
1421564700, 1421568300, 1421568300, 1421571900), class = c("POSIXct", 
"POSIXt"), tzone = "America/New_York"), Server1 = c(0.75, 1, 
0.82, 1, 0.75, 1.08, 0.92)), .Names = c("Time", "Server1"), row.names = c(1L, 
13L, 2L, 14L, 3L, 15L, 4L), class = "data.frame")

As you can see, I have multiple data points for the same time period. I need to modify this data frame so that it includes only the max value for each time period.

For example, for the time period "2015-01-18 01:05:00" I see 0.75 and 1.00; I need to include only 1.00 for that time period. Any ideas how I could do this, i.e. remove duplicate times and keep only the max value?

score 3 · Accepted Answer · answered Jan 26 '15 at 04:06


> require(dplyr)
> data1 %>% group_by(Time) %>% filter(Server1 == max(Server1))
Source: local data frame [4 x 2]
Groups: Time

                 Time Server1
1 2015-01-18 01:05:00    1.00
2 2015-01-18 02:05:00    1.00
3 2015-01-18 03:05:00    1.08
4 2015-01-18 04:05:00    0.92
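
If dplyr is not available, roughly the same result can be sketched in base R with aggregate(). This is only an illustration: the data frame is rebuilt from the values in the question's dput output, and `max_per_time` is a name made up for this example.

```r
# Rebuild the question's data frame (values copied from the dput output).
data1 <- data.frame(
  Time = as.POSIXct(
    c(1421561100, 1421561100, 1421564700, 1421564700,
      1421568300, 1421568300, 1421571900),
    origin = "1970-01-01", tz = "America/New_York"
  ),
  Server1 = c(0.75, 1, 0.82, 1, 0.75, 1.08, 0.92)
)

# One row per distinct Time, keeping the largest Server1 within each group.
# max_per_time is a hypothetical name used only in this sketch.
max_per_time <- aggregate(Server1 ~ Time, data = data1, FUN = max)
print(max_per_time)
```

Unlike the filter() approach, aggregate() always returns a single row per Time, because it reduces each group to one value instead of selecting matching rows.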


Ricky



If there are ties, something like this would be helpful. `filter(group_by(mydf, Time), Server1 == max(Server1, na.rm = TRUE)) %>% distinct()` – jazzurro Jan 26 '15 at 04:17
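
To illustrate the point about ties, here is a small sketch. The `tied` data frame is hypothetical (it is not the question's data), and the summarise() variant at the end is an alternative not mentioned in the thread; it assumes only Time and the maximum are needed in the result.

```r
library(dplyr)

# Hypothetical data: two rows share the maximum for the first Time value.
tied <- data.frame(
  Time = as.POSIXct(c(1421561100, 1421561100, 1421561100, 1421564700),
                    origin = "1970-01-01", tz = "America/New_York"),
  Server1 = c(1, 1, 0.75, 0.82)
)

# filter() keeps every row equal to the group maximum, so the tie yields two rows.
with_ties <- tied %>%
  group_by(Time) %>%
  filter(Server1 == max(Server1, na.rm = TRUE))

# distinct() collapses rows that are identical in every column.
deduped <- with_ties %>% distinct()

# summarise() reduces each group to one row, whether or not there are ties.
one_per_group <- tied %>%
  group_by(Time) %>%
  summarise(Server1 = max(Server1, na.rm = TRUE))
```

Note that distinct() only removes tied rows when they match in all columns; if the data frame had further columns with differing values, the tied rows would remain, whereas summarise() would still return one row per Time.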
