1

So i have this Dataframe that have different variables of interest separated by hour. I want to aggregate those variables with the sum (Amount) by day by variable (Type), but im new at R and im having a hard time to do it. It looks like this:

Date        Hour      Type        ID  Amount   Remaining    Price 1   Price 2   Sell
01/01/2019  12:03:50  Devolution  D   6        -            2,8       2,8       3,19
01/01/2019  12:03:50  Devolution  D   10       -            2,8       2,8       3,19
01/01/2019  13:00:01  Correction  D   354      0            2,8       2,8       3,19
01/01/2019  18:30:20  Client      L   1        -            2,8       2,8       3,19
02/01/2019  02:00:30  Enter       D   36       -            2,2       -         4,29
02/01/2019  08:38:18  Enter       D   10       -            2,2       -         4,29
02/01/2019  19:00:20  Selling     L   1        -99          2,0       2,5       3,59
02/01/2019  22:12:00  Selling     L   3        -            2,0       2,5       3,59
02/01/2019  23:47:52  Selling     L   1        -            2,0       2,5       3,59

I want to transform it to be like this:

Date          Type        ID  Amount   Remaining    Price 1   Price 2   Sell
01/01/2019    Devolution  D   16       -            2,8       2,8       3,19
01/01/2019    Correction  D   354      0            2,8       2,8       3,19
01/01/2019    Client      L   1        -            2,8       2,8       3,19
02/01/2019    Enter       D   46       -            2,2       -         4,29
02/01/2019    Selling     L   4        -99          2,0       2,5       3,59

1 Answers1

0

After grouping by 'Date', 'Type', 'ID', get the sum of 'Amount' and then get the distinct rows

library(dplyr)
df1 %>%
    group_by(Date, Type, ID) %>%
    mutate(Amount = sum(Amount)) %>%
    select(-Hour) %>%
    ungroup %>%
    distinct(Date, Type, ID, .keep_all = TRUE)
# A tibble: 5 x 8
#   Date       Type       ID    Amount Remaining Price1 Price2 Sell 
#   <chr>      <chr>      <chr>  <int> <chr>     <chr>  <chr>  <chr>
#1 01/01/2019 Devolution D         16 -         2,8    2,8    3,19 
#2 01/01/2019 Correction D        354 0         2,8    2,8    3,19 
#3 01/01/2019 Client     L          1 -         2,8    2,8    3,19 
#4 02/01/2019 Enter      D         46 -         2,2    -      4,29 
#5 02/01/2019 Selling    L          5 -99       2,0    2,5    3,59 

data

df1 <- structure(list(Date = c("01/01/2019", "01/01/2019", "01/01/2019", 
"01/01/2019", "02/01/2019", "02/01/2019", "02/01/2019", "02/01/2019", 
"02/01/2019"), Hour = c("12:03:50", "12:03:50", "13:00:01", "18:30:20", 
"02:00:30", "08:38:18", "19:00:20", "22:12:00", "23:47:52"), 
    Type = c("Devolution", "Devolution", "Correction", "Client", 
    "Enter", "Enter", "Selling", "Selling", "Selling"), ID = c("D", 
    "D", "D", "L", "D", "D", "L", "L", "L"), Amount = c(6L, 10L, 
    354L, 1L, 36L, 10L, 1L, 3L, 1L), Remaining = c("-", "-", 
    "0", "-", "-", "-", "-99", "-", "-"), Price1 = c("2,8", "2,8", 
    "2,8", "2,8", "2,2", "2,2", "2,0", "2,0", "2,0"), Price2 = c("2,8", 
    "2,8", "2,8", "2,8", "-", "-", "2,5", "2,5", "2,5"), Sell = c("3,19", 
    "3,19", "3,19", "3,19", "4,29", "4,29", "3,59", "3,59", "3,59"
    )), class = "data.frame", row.names = c(NA, -9L))
akrun
  • 874,273
  • 37
  • 540
  • 662