0

I am looking to make this table below. I have the first 2 columns of data for 2 years and wanting to calculate the daily mean price and apply it to all relevant days.

Publish_Date Product_Price Daily_Mean
2019-07-01-- 146---------- 142-------
2019-07-01-- 144---------- 142-------
2019-07-01-- 136---------- 142-------
2019-07-02-- 120---------- 123-------
2019-07-02-- 126---------- 123-------
2019-07-02-- 123---------- 123-------
2019-07-03-- 112---------- 112-------

I have tried the following:

TGPDailyMean = aggregate(Product_Price ~ Publish_Date, TGP, mean)

but it only gives one value per day, shortening the amount of rows by a factor of 3 or so. I need the amount of rows to be the same so I can take the difference between another dataframe and Daily_Mean.

I have also tried:

TGP$DailyMean = lapply(TGP$Product_Price, mean)

but this only replicates the values in Product_Price and does not find the mean per day.

nicola
  • 24,005
  • 3
  • 35
  • 56

2 Answers2

1

Tidyverse Solution

You can use group_by and mutate:

library(dplyr)
TPG %>%
  group_by(Publish_Date) %>%
  mutate(Daily_Mean = mean(Product_Price)) %>%
  ungroup()

#> # A tibble: 7 x 3
#>   Publish_Date Product_Price Daily_Mean
#>   <chr>                <int>      <dbl>
#> 1 2019-07-01             146        142
#> 2 2019-07-01             144        142
#> 3 2019-07-01             136        142
#> 4 2019-07-02             120        123
#> 5 2019-07-02             126        123
#> 6 2019-07-02             123        123
#> 7 2019-07-03             112        112

Base R solution

As suggest by @nicola in the comments, you can also use ave:

TPG$Daily_Mean <- ave(TPG$Product_Price, TPG$Publish_Date)
TPG
#>   Publish_Date Product_Price Daily_Mean
#> 1   2019-07-01           146        142
#> 2   2019-07-01           144        142
#> 3   2019-07-01           136        142
#> 4   2019-07-02           120        123
#> 5   2019-07-02           126        123
#> 6   2019-07-02           123        123
#> 7   2019-07-03           112        112
Edo
  • 7,567
  • 2
  • 9
  • 19
0

data.table oneliner

library(data.table)
# or setDT(mydata)
setDT(mydata)[, Daily_Mean2 := mean(Product_Price), by = .(Publish_Date)]
#    Publish_Date Product_Price Daily_Mean Daily_Mean2
# 1:   2019-07-01           146        142         142
# 2:   2019-07-01           144        142         142
# 3:   2019-07-01           136        142         142
# 4:   2019-07-02           120        123         123
# 5:   2019-07-02           126        123         123
# 6:   2019-07-02           123        123         123
# 7:   2019-07-03           112        112         112
Wimpel
  • 26,031
  • 1
  • 20
  • 37