Given the following data.frame, I would like to calculate the mean of GOOG.Open between 2011-01-03 and 2011-01-06:

             GOOG.Open GOOG.High GOOG.Low GOOG.Close GOOG.Volume
2011-01-03    297.94    302.49   297.94     301.87          NA
2011-01-04    302.51    302.79   299.76     300.76          NA
2011-01-05    299.73    304.86   299.72     304.23          NA
2011-01-06    305.03    308.91   304.72     306.44          NA

The call mean(data$GOOG.Open, seq(from=01/03/11, to=01/06/11)) gives me 529.8661, which is evidently computed from different values in the data frame. How do I calculate the mean over just these dates?

Cœur

1 Answer

First, you need to show how your data is stored; see: How to make a great R reproducible example?

I'm using dplyr (part of the tidyverse) to analyse the data and lubridate to parse the dates. This assumes that you want to be able to vary the date range being averaged.

library(tidyverse)
library(lubridate)

# Example data with the dates held in an ordinary column
dat <- data.frame(date = c('2011-01-03','2011-01-04','2011-01-05','2011-01-06'), 
                  GOOG.Open = c(297.94,302.51,299.73,305.03))
dat %>% 
    mutate(date = ymd(date)) %>%   # parse the strings into Date objects
    filter(date >= ymd('2011-01-03') & date <= ymd('2011-01-06')) %>%   # inclusive date window
    summarise(goog_mean = mean(GOOG.Open))

If you just want the mean of the data presented you can use:

mean(dat$GOOG.Open) 

or

dat %>% 
    summarise(mean = mean(GOOG.Open))
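
If the dates are stored as row names rather than in a column, as in the table shown in the question, one possible approach is to convert the row names to Date and subset on them. The following is only a minimal base-R sketch: the data frame `goog` and the helper vectors `dates` and `in_window` are hypothetical names, and the values are copied by hand from the question.

```r
# Rebuild the question's GOOG.Open values with the dates as row names
# (assumption: the real object is a plain data.frame laid out like this)
goog <- data.frame(
  GOOG.Open = c(297.94, 302.51, 299.73, 305.03),
  row.names = c("2011-01-03", "2011-01-04", "2011-01-05", "2011-01-06")
)

# Convert the row names to Date so the comparison is chronological, not textual
dates <- as.Date(rownames(goog))

# Logical index of the rows inside the window (both ends inclusive)
in_window <- dates >= as.Date("2011-01-03") & dates <= as.Date("2011-01-06")

# Mean of the opening price over the selected rows
goog_mean <- mean(goog$GOOG.Open[in_window], na.rm = TRUE)
goog_mean
```

For these four rows the result is 301.3025. This assumes a plain data.frame; if the downloaded object is an xts or zoo series instead, the dates live in its index rather than in row names, so something like `zoo::index()` would be needed in place of `rownames()`.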
B Williams
  • Nice, thanks for the hint. I am downloading the data from Google, and the "dates" are actually the row names. Do you have an idea how I can reference these dates if they are row names rather than a column? – aleximeyer Sep 19 '17 at 23:01