-1

I have a data.table, I want to use the Time column to merge data to reduce data.

library(data.table)
DT <- data.table(ID=c("A","A","A","B","B","C","C","C","C","D"),
                 Time=c("2019-01-16 15:52:03","2019-01-16 16:01:04","2019-01-26 01:22:54",
                       "2019-02-18 17:00:08","2019-02-18 17:05:44",
                       "2019-03-16 13:23:42","2019-03-16 15:52:03","2019-06-04 12:01:04","2019-06-04 16:20:54",
                       "2019-03-16 13:23:42"),
                 place=c("Vienna","France","Berlin","Rome","Washington",
                         "Bangkok","Ottawa","Tokyo","SouthKorea","Singapore"))

If at the day, the same day and the same Id will be merged.

Different days, there is no need to merge

output:

     ID      Time           place
 1    A   2019-01-16    Vienna-France
 2    A   2019-01-26        Berlin
 3    B   2019-02-18    Rome-Washington
 4    C   2019-03-16    Bangkok-Ottawa
 5    C   2019-06-04    Tokyo-SouthKorea
 6    D   2019-03-16      Singapore 

What should I do? Thank you.

kolinunlt
  • 349
  • 1
  • 11

3 Answers3

5

I see you prefer data.table (for that see the post from @January), however, here is one dplyr solution:

DT %>%
 group_by(ID, Time = as.Date(Time, format = "%Y-%m-%d")) %>%
 summarise(place = paste(place, collapse = "-"))

  ID    Time       place           
  <chr> <date>     <chr>           
1 A     2019-01-16 Vienna-France   
2 A     2019-01-26 Berlin          
3 B     2019-02-18 Rome-Washington 
4 C     2019-03-16 Bangkok-Ottawa  
5 C     2019-06-04 Tokyo-SouthKorea
6 D     2019-03-16 Singapore   
tmfmnk
  • 38,881
  • 4
  • 47
  • 67
3

EDIT: I just noticed that it was supposed to be grouped by ID as well.

 DT[ , by = .(ID, as.Date(Time, "%Y-%m-%d")), .(place=paste(place, collapse="-")) ]

   ID    as.Date            place
1:  A 2019-01-16    Vienna-France
2:  A 2019-01-26           Berlin
3:  B 2019-02-18  Rome-Washington
4:  C 2019-03-16   Bangkok-Ottawa
5:  C 2019-06-04 Tokyo-SouthKorea
6:  D 2019-03-16        Singapore
January
  • 16,320
  • 6
  • 52
  • 74
2

You can use also base R:

aggregate(place ~ ID + as.Date(Time) , DT, paste0,collapse = '-')

  ID as.Date(Time)            place
1  A    2019-01-16    Vienna-France
2  A    2019-01-26           Berlin
3  B    2019-02-18  Rome-Washington
4  C    2019-03-16   Bangkok-Ottawa
5  D    2019-03-16        Singapore
6  C    2019-06-04 Tokyo-SouthKorea
s__
  • 9,270
  • 3
  • 27
  • 45