-1

I have a dataframe, which looks like this:

Date                             Price     Type
2018-08-17 10:48:02              120        A
2018-08-17 10:55:02              100        A
2018-08-17 10:57:02              70         B
2018-08-17 10:58:07              69         B
2018-08-20 22:58:13              56         A
2018-08-21 04:19:53              79         A
2018-08-21 04:29:56              40         A
2018-08-21 09:15:07              11         B
2018-08-21 17:07:03              600        A
2018-08-21 17:57:11              225        C

I want to transform Date column, so it shows only day and hour and for values in column "Price", they must sum for each hour depending on column Type. So, desired result must look like this:

    Date                  Price_sum        Type
2018-08-17 10:00:00         220             A
2018-08-17 10:00:00         139             B
2018-08-20 22:00:00         56              A
2018-08-21 04:00:00         119             A
2018-08-21 09:00:00         11              B
2018-08-21 17:00:00         600             A
2018-08-21 17:00:00         225             C

How could i get it? I have no idea

french_fries
  • 1,149
  • 6
  • 22

2 Answers2

1

We can use floor_date from lubridate to floor the time at nearest hour and sum those values.

library(dplyr)
library(lubridate)

df %>%
  group_by(Date = floor_date(ymd_hms(Date), 'hour'), Type) %>%
  summarise(Price_sum = sum(Price), 
            n = n())

#  Date                Type  Price_sum     n
#  <dttm>              <chr>     <int> <int>
#1 2018-08-17 10:00:00 A           220     2
#2 2018-08-17 10:00:00 B           139     2
#3 2018-08-20 22:00:00 A            56     1
#4 2018-08-21 04:00:00 A           119     2
#5 2018-08-21 09:00:00 B            11     1
#6 2018-08-21 17:00:00 A           600     1
#7 2018-08-21 17:00:00 C           225     1
Ronak Shah
  • 377,200
  • 20
  • 156
  • 213
0

Another way you can try

library(dplyr)
library(lubridate)
df %>% 
  mutate(Date = ymd_hms(Date),
         hour_only = hour(Date)) %>% 
  group_by(Type, hour_only) %>% 
  mutate(Price2 = sum(Price)) %>% 
  slice(1) %>% 
  ungroup() %>% 
  arrange(Date) %>% 
  select(Date, Price2, Type)
# Date                Price2 Type 
# <dttm>               <int> <chr>
# 1 2018-08-17 10:48:02    220 A    
# 2 2018-08-17 10:57:02    139 B    
# 3 2018-08-20 22:58:13     56 A    
# 4 2018-08-21 04:19:53    119 A    
# 5 2018-08-21 09:15:07     11 B    
# 6 2018-08-21 17:07:03    600 A    
# 7 2018-08-21 17:57:11    225 C  
Tho Vu
  • 1,304
  • 2
  • 8
  • 20