I have a list of dates, and each date has a value.

This is what my data frame looks like right now. Note that dates can repeat, but a repeated date always carries the same value (e.g. rows 2 and 3 have the same date, and their values are also the same).

  date         value
1 2018-02-08   1
2 2018-02-09   2
3 2018-02-09   2
4 2018-02-10   4
  ...          ...

This is what I want my data frame to look like

     date         value  weekavg
   1 2018-02-08    1     ...
   2 2018-02-09    2     ...
   3 2018-02-09    2     ...
   4 2018-02-10    4     ...
   5 2018-02-11    0     ...
   6 2018-02-12    0     ...
   7 2018-02-13    0     ...
   8 2018-02-14    0     ...
   9 2018-02-15    0     1
     ...           ...   ...

To clarify, the entry in the ninth row is the average of the values over the week before it, so for 2018-02-15 that is the date range 2018-02-08 to 2018-02-14, with each date counted once. Thus, the result is 1, since (1+2+4+0+0+0+0)/7 = 1. How could I do this in R for every row?
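The calculation described above can be sketched in base R roughly as follows. This is only an illustration under assumptions: the data are retyped from the tables above, `prev_week_mean` is a hypothetical helper name, and rows with fewer than seven earlier days available are assumed to get `NA`.

```r
# Sample data retyped from the tables above (rows 2 and 3 are duplicates).
df <- data.frame(
  date  = as.Date(c("2018-02-08", "2018-02-09", "2018-02-09", "2018-02-10",
                    "2018-02-11", "2018-02-12", "2018-02-13", "2018-02-14",
                    "2018-02-15")),
  value = c(1, 2, 2, 4, 0, 0, 0, 0, 0)
)

# Keep one row per date so duplicated days are not counted twice.
daily <- unique(df)

# Hypothetical helper: mean of the values on the 7 calendar days before `d`,
# or NA when fewer than 7 of those days are present in the data.
prev_week_mean <- function(d) {
  in_window <- daily$date >= d - 7 & daily$date < d
  if (sum(in_window) < 7) NA_real_ else mean(daily$value[in_window])
}

# Apply the helper to every row, including the duplicated ones.
df$weekavg <- vapply(seq_len(nrow(df)),
                     function(i) prev_week_mean(df$date[i]),
                     numeric(1))
print(df)
```

Because the window is selected by calendar date rather than by row position, this version does not rely on every day being present in the data.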

------ Reproducible example -----

data

lines <-    "date      value
        1   2018-02-08    NA
        2   2018-02-08    NA
        3   2018-02-09    NA
        4   2018-02-10   295
        5   2018-02-10   295
        6   2018-02-11   329
        7   2018-02-12   242
        8   2018-02-12   242
        9   2018-02-13   317
        10  2018-02-14   341
        11  2018-02-15   292
        12  2018-02-16   363
        13  2018-02-17   380
        14  2018-02-18   319
        15  2018-02-19   307
        16  2018-02-20   328
        17  2018-02-21   290"

library(zoo)  # provides rollmeanr

df <- read.table(text = lines)
newDF <- merge(df, transform(unique(df), mean = rollmeanr(value, 7, fill = NA)))

The mean column is all NAs for me.

P.S. Apologies for the image comments, I didn't know. Your help is much appreciated.

rdk
  • The `zoo` package was written for tasks like this, specifically `zoo::rollapply`. – r2evans Aug 06 '18 at 19:12
  • Been looking at the zoo package for the last 35 minutes, can't figure out how to use it for my problem correctly. Any way you could try to answer the question? I looked but didn't find anything regarding zoo that was similar enough to my problem to help me. struggling here :/ – rdk Aug 06 '18 at 19:49

1 Answer

The question does not fully define the output but assuming:

  • there are no missing days, only duplicated days
  • if a day is duplicated then the average on its row should be duplicated

then:

library(zoo)

merge(DF, transform(unique(DF), mean = rollmeanr(value, 7, fill = NA)))

For the sample data shown reproducibly in the Note at the end this gives:

        date value      mean
1 2018-02-08     1        NA
2 2018-02-09     2        NA
3 2018-02-09     2        NA
4 2018-02-10     4        NA
5 2018-02-11     0        NA
6 2018-02-12     0        NA
7 2018-02-13     0        NA
8 2018-02-14     0 1.0000000
9 2018-02-15     0 0.8571429
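Note that `rollmeanr(value, 7)` averages the current row together with the six rows before it, which is why 2018-02-15 shows 0.8571429 here rather than the 1 the question expects. If the window should instead cover only the seven days before the current one, one possible variant is to pass the window to `rollapply` as a list of offsets. This is a sketch under the same assumptions as above (no missing days, only duplicated ones); the column name `weekavg` is taken from the question, and the data are retyped from the Note.

```r
library(zoo)

# Same sample data as in the Note, typed in directly.
DF <- data.frame(
  date  = c("2018-02-08", "2018-02-09", "2018-02-09", "2018-02-10", "2018-02-11",
            "2018-02-12", "2018-02-13", "2018-02-14", "2018-02-15"),
  value = c(1, 2, 2, 4, 0, 0, 0, 0, 0)
)

# A list-valued width is read as offsets relative to the current row:
# -7, ..., -1 selects the seven rows before it and excludes the row itself.
prev <- transform(unique(DF),
                  weekavg = rollapply(value, list(-(7:1)), mean, fill = NA))

# Copy the per-day result back onto the duplicated rows.
res <- merge(DF, prev)
print(res)
```

With these offsets the only row with a full week behind it is 2018-02-15, so it is the only one that receives a non-NA value.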

Note

Lines <- "
     date         value 
   1 2018-02-08    1 
   2 2018-02-09    2  
   3 2018-02-09    2 
   4 2018-02-10    4    
   5 2018-02-11    0 
   6 2018-02-12    0 
   7 2018-02-13    0 
   8 2018-02-14    0    
   9 2018-02-15    0
"
DF <- read.table(text = Lines)
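
Regarding the all-NA result reported for the question's reproducible example: that data has `NA` entries in `value`, and the zoo documentation states that `rollmean` does not handle inputs containing NAs and suggests `rollapply` instead. A possible workaround, assuming the NAs should simply be skipped within each window, is sketched below. The data are retyped from the question without the row numbers, and the column name `weekavg` is my choice.

```r
library(zoo)

# The question's example data, retyped without the row-number column.
Lines <- "date value
2018-02-08 NA
2018-02-08 NA
2018-02-09 NA
2018-02-10 295
2018-02-10 295
2018-02-11 329
2018-02-12 242
2018-02-12 242
2018-02-13 317
2018-02-14 341
2018-02-15 292
2018-02-16 363
2018-02-17 380
2018-02-18 319
2018-02-19 307
2018-02-20 328
2018-02-21 290"
df <- read.table(text = Lines, header = TRUE)

# rollapplyr passes na.rm = TRUE on to mean(), so NAs inside a window are
# dropped instead of turning the result into NA.
newDF <- merge(df, transform(unique(df),
                             weekavg = rollapplyr(value, 7, mean,
                                                  na.rm = TRUE, fill = NA)))
print(newDF)
```

The first six distinct dates still get `NA` because of `fill = NA`: there are not yet seven rows to average. Windows that contain NAs are averaged over fewer than seven values, which may or may not be what is wanted.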
G. Grothendieck
  • thanks! However, though I was able to reproduce your example, for some reason, the mean column for my real data is just full of NA's. Any thoughts? – rdk Aug 06 '18 at 20:26
  • You will need to provide a reproducible example. – G. Grothendieck Aug 06 '18 at 20:31
  • hoping this helps https://imgur.com/0GZa7pQ https://imgur.com/S7OE5or the entire mean column is NA's – rdk Aug 06 '18 at 20:34
  • Do not post images of code or data. A reproducible example should be done in text and in the question, not in a comment, such that we can just copy your text and paste it into R to see the claimed result. – G. Grothendieck Aug 06 '18 at 20:39
  • Do you really think it is acceptable to ask people helping you to *transcribe* data from an image? It would be much easier to **copy something from your console**. From [a handful of recommendations](https://stackoverflow.com/questions/5963269), I suggest: `dput(head(x))` or `read.table(text=...)` like @G.Grothendieck used in his answer. – r2evans Aug 06 '18 at 20:39
  • I apologize for sounding angry or such. Speaking for myself, I'm often doing this during a coffee break or similar, so things that require significantly more time can be frustrating. "Asking good questions" is not an obvious skill, I know I've had to develop it myself. The best place to include sample data is in the original question itself: comments do not do formatting well, and nothing else is an appropriate venue here. – r2evans Aug 06 '18 at 20:58