I have a dataframe:

Date                      ID     Type    Value
2020-08-04 03:00:00        1    active     14
2020-08-04 03:00:00        1    active     15
2020-08-04 03:00:00        2    active     16
2020-08-04 03:00:00        2    passive     17

I want to remove rows that duplicate an earlier row in the columns Date, ID and Type, keeping only the first occurrence. So the desired result is:

Date                      ID     Type    Value
2020-08-04 03:00:00        1    active     14
2020-08-04 03:00:00        2    active     16
2020-08-04 03:00:00        2    passive     17

As you can see, the second row has disappeared. How can I do that?

3 Answers

I would suggest building a combined key from the three columns with paste() and then dropping the repeated keys with duplicated():

#Code
# paste() builds one key per row; !duplicated() keeps the first row for each key
mdf[!duplicated(paste(mdf$Date, mdf$ID, mdf$Type)), ]

Output:

                Date ID    Type Value
1 04/08/2020 3:00:00  1  active    14
3 04/08/2020 3:00:00  2  active    16
4 04/08/2020 3:00:00  2 passive    17

Some data used:

#Data
mdf <- structure(list(Date = c("04/08/2020 3:00:00", "04/08/2020 3:00:00", 
"04/08/2020 3:00:00", "04/08/2020 3:00:00"), ID = c(1L, 1L, 2L, 
2L), Type = c("active", "active", "active", "passive"), Value = 14:17), row.names = c(NA, 
-4L), class = "data.frame")
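
A variant of the same idea that skips the paste() step, as a minimal sketch: duplicated() also accepts a data frame, so it can be applied directly to the three key columns. The sample data is rebuilt here so the snippet runs on its own; the names key_cols and deduped are only illustrative.

```r
# Sample data, same values as in the question
mdf <- data.frame(
  Date  = rep("04/08/2020 3:00:00", 4),
  ID    = c(1L, 1L, 2L, 2L),
  Type  = c("active", "active", "active", "passive"),
  Value = 14:17
)

# duplicated() on a data frame compares whole rows of the selected columns
# and flags every repeat after the first occurrence
key_cols <- c("Date", "ID", "Type")
deduped <- mdf[!duplicated(mdf[key_cols]), ]
deduped

# To keep the last occurrence of each key instead of the first
deduped_last <- mdf[!duplicated(mdf[key_cols], fromLast = TRUE), ]
deduped_last
```

With fromLast = TRUE the row with Value 15 survives for ID 1 instead of the row with Value 14, which may or may not be what is wanted.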
Duck

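If your goal is to keep the minimum Value within each combination of Date, ID and Type, you can use this dplyr solution:

library(dplyr)

mdf %>% 
  group_by(Date, ID, Type) %>% 
  mutate(Value = min(Value)) %>%  # overwrite Value with the group minimum
  unique()                        # the rows in a group are now identical, so drop the repeats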

Which gives us:

  Date                  ID Type    Value
  <chr>              <int> <chr>   <int>
1 04/08/2020 3:00:00     1 active     14
2 04/08/2020 3:00:00     2 active     16
3 04/08/2020 3:00:00     2 passive    17
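
The same group minimum can probably be written more directly with summarise(), which collapses each group to a single row in one step. This is only a sketch: it assumes dplyr 1.0.0 or later (for the .groups argument) and that no columns other than the three keys and Value need to be kept. The name min_per_group is illustrative.

```r
library(dplyr)

# Sample data, same values as in the question
mdf <- data.frame(
  Date  = rep("04/08/2020 3:00:00", 4),
  ID    = c(1L, 1L, 2L, 2L),
  Type  = c("active", "active", "active", "passive"),
  Value = 14:17
)

# One output row per Date/ID/Type group, holding the smallest Value;
# .groups = "drop" returns an ungrouped result
min_per_group <- mdf %>%
  group_by(Date, ID, Type) %>%
  summarise(Value = min(Value), .groups = "drop")

min_per_group
```

Unlike the mutate() plus unique() version, summarise() drops any column that is neither a grouping key nor computed in the call.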
Matt

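Using dplyr's distinct():

# Each data row has one more field than the header line, so read.table()
# uses the first field as row names; the timestamp is read as two columns,
# Date and Time.
tble = read.table(text='
Date       Time           ID     Type    Value
1 2020-08-04 03:00:00        1    active     14
2 2020-08-04 03:00:00        1    active     15
3 2020-08-04 03:00:00        2    active     16
4 2020-08-04 03:00:00        2    passive     17', header = TRUE)

library(dplyr)

# .keep_all = TRUE keeps the remaining columns from the first row of each key
tble %>% distinct(Date, Time, ID, Type, .keep_all=TRUE)
#>         Date     Time ID    Type Value
#> 1 2020-08-04 03:00:00  1  active    14
#> 3 2020-08-04 03:00:00  2  active    16
#> 4 2020-08-04 03:00:00  2 passive    17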

Created on 2020-09-04 by the reprex package (v0.3.0)
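
To make the role of .keep_all concrete, here is a small sketch comparing the two forms of distinct(). The data frame is rebuilt with data.frame() so the snippet is self-contained; keys_only and full_rows are illustrative names.

```r
library(dplyr)

# Sample data, same values as in the question
tble <- data.frame(
  Date  = rep("2020-08-04 03:00:00", 4),
  ID    = c(1L, 1L, 2L, 2L),
  Type  = c("active", "active", "active", "passive"),
  Value = 14:17
)

# Without .keep_all, only the columns named in distinct() are returned
keys_only <- tble %>% distinct(Date, ID, Type)
names(keys_only)

# With .keep_all = TRUE, every column is kept, taken from the first row
# of each Date/ID/Type combination
full_rows <- tble %>% distinct(Date, ID, Type, .keep_all = TRUE)
names(full_rows)
```

Since the question asks to keep Value, the .keep_all = TRUE form is the one that matches the desired result.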

monte