0

This should be a fairly simple task: I am trying to calculate the mean price per day. There are 3 different days here, and each has several prices. This is the DataFrame I start with:

 ID       Date      RoomAv    Price
  1    2001-01-02    TRUE      110
  2    2001-01-04    FALSE     120
  3    2001-01-03    TRUE      130
  4    2001-01-03    TRUE      140
  5    2001-01-03    TRUE      150
  6    2001-01-02    FALSE     160
  7    2001-01-02    TRUE      170
  8    2001-01-04    TRUE      180
  9    2001-01-04    FALSE     190
 10    2001-01-02    TRUE      200

I need it to be something like this

    Date      AveragePrice
 2001-01-02       num1
 2001-01-03       num2
 2001-01-04       num3

This is what I tried to do

df <- DataFrame %>%
  group_by(DataFrame$Date) %>%
  summarize(DataFrame$price == mean(DataFrame$Price))

and I got:

Error: Column `DataFrame$price == mean(DataFrame$Price)` must be length 1 (a summary value), not 0

I have not used the data.table library, but I would also like to know how to do this with it.

Ron

5 Answers

4

An option with data.table

library(data.table)
setDT(df)[, .(Price = mean(Price)), by = Date]
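
For anyone new to data.table, here is a minimal, self-contained sketch of the same idea using the prices from the question. The names dt, avg and AveragePrice are only illustrative, and as.data.table() is used here instead of setDT() so that the original data frame is left unchanged.

```r
library(data.table)

# Sample data copied from the question (ID and RoomAv omitted for brevity)
df <- data.frame(
  Date  = c("2001-01-02", "2001-01-04", "2001-01-03", "2001-01-03", "2001-01-03",
            "2001-01-02", "2001-01-02", "2001-01-04", "2001-01-04", "2001-01-02"),
  Price = c(110, 120, 130, 140, 150, 160, 170, 180, 190, 200)
)

# as.data.table() makes a copy; setDT() would convert df in place instead
dt <- as.data.table(df)

# The j argument says what to compute, keyby says how to group.
# keyby (rather than by) also sorts the result by Date.
avg <- dt[, .(AveragePrice = mean(Price)), keyby = Date]
print(avg)
```

The general pattern is dt[i, j, by]: filter rows with i, compute j, grouped by by.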
akrun
2

You can do something like

Using dplyr

library(dplyr)
df <- DataFrame %>%
  group_by(Date) %>%
  summarize(price = mean(Price))  # one row per Date; = assigns, == compares

Using data.table

library(data.table)
df <- setDT(DataFrame)[, .(Price = mean(Price)), by = Date]  # setDT() converts the data.frame first
YOLO
2

You can use aggregate() from base R to do this:

dfout <- aggregate(Price ~ Date, df, mean)

such that

> dfout
        Date    Price
1 2001-01-02 160.0000
2 2001-01-03 140.0000
3 2001-01-04 163.3333

DATA

df <- structure(list(ID = 1:10, Date = c("2001-01-02", "2001-01-04", 
"2001-01-03", "2001-01-03", "2001-01-03", "2001-01-02", "2001-01-02", 
"2001-01-04", "2001-01-04", "2001-01-02"), RoomAv = c(TRUE, FALSE, 
TRUE, TRUE, TRUE, FALSE, TRUE, TRUE, FALSE, TRUE), Price = c(110L, 
120L, 130L, 140L, 150L, 160L, 170L, 180L, 190L, 200L)), class = "data.frame", row.names = c(NA, 
-10L))
ThomasIsCoding
1

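Remember that in R, == tests whether one value is equal to another, as in x == 1. To create the new variable in summarize, you should assign it with = instead. Also refer to columns by their bare names inside dplyr verbs rather than with DataFrame$, because DataFrame$Price always means the whole column and ignores the grouping. Here is the corrected version: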

library(dplyr)
DataFrame %>%
  group_by(Date) %>%
  summarize(avrgPrice = mean(Price))
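
To see where the "not 0" in the error message likely comes from, here is a small base R sketch. It assumes a cut-down version of the question's data; missing_col and cmp are illustrative names only.

```r
# Cut-down version of the question's data
DataFrame <- data.frame(
  Date  = c("2001-01-02", "2001-01-03", "2001-01-03"),
  Price = c(110, 130, 140)
)

# R is case-sensitive: there is no column called `price` (lowercase),
# so `$` returns NULL instead of raising an error
missing_col <- DataFrame$price
print(is.null(missing_col))

# Comparing NULL with a number gives a zero-length logical vector,
# which is not a valid length-1 summary value
cmp <- missing_col == mean(DataFrame$Price)
print(length(cmp))
```

So the original attempt had two separate problems: the lowercase column name, and the use of == where = was intended.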
0

Thanks. Actually, I found this method to be the shortest:

dfMean <- aggregate(DataFrame$Price ~ DataFrame$Date, DataFrame, mean)
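
One caveat with the call above: since DataFrame is already passed as the data argument, the DataFrame$ prefixes should not be needed, and with them the result columns are named after the full expressions rather than plain Date and Price. A possible variant, sketched with the question's prices (the AveragePrice name is just taken from the desired output):

```r
# Sample data copied from the question (ID and RoomAv omitted for brevity)
DataFrame <- data.frame(
  Date  = c("2001-01-02", "2001-01-04", "2001-01-03", "2001-01-03", "2001-01-03",
            "2001-01-02", "2001-01-02", "2001-01-04", "2001-01-04", "2001-01-02"),
  Price = c(110, 120, 130, 140, 150, 160, 170, 180, 190, 200)
)

# With data = DataFrame supplied, the formula can use bare column names
dfMean <- aggregate(Price ~ Date, data = DataFrame, FUN = mean)

# Rename the columns to match the layout asked for in the question
names(dfMean) <- c("Date", "AveragePrice")
print(dfMean)
```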
Ron