
I am new to R and I have a problem. I have a data frame, read from a CSV file, with more than 80,000 rows. It has a column (maj) filled with 0s and 1s, a column with each day of the year, and a column with the price per day (plus other columns). maj = 1 means that the price was updated that day. What I want to do is: if maj has been 0 for the last 30 days, the price should be replaced by "N/A".

Here's a sample of my df:

      day       maj     price
   2019-01-02    1      1435
   2019-01-03    0      1435
   2019-01-04    0      1435
   2019-01-05    0      1435

For example, if maj = 0 from 2019-01-03 through 2019-02-03, I want to replace the price with N/A for 2019-02-04 and all following days, until maj = 1 again.

I don't have any code to show because I deleted it when I saw that nothing was working. I tried rollapplyr from the zoo package, which involved creating a function and values to compute a rolling monthly sum, but I don't understand it.

Does anyone know how to do this?

Thanks,

marie_mrc
  • Welcome to SO! Could you make your problem reproducible by sharing a sample of your data and the code you're working on so others can help (please do not use `str()`, `head()` or screenshot)? You can use the [`reprex`](https://reprex.tidyverse.org/articles/articles/magic-reprex.html) and [`datapasta`](https://cran.r-project.org/web/packages/datapasta/vignettes/how-to-datapasta.html) packages to assist you with that. See also [Help me Help you](https://speakerdeck.com/jennybc/reprex-help-me-help-you?slide=5) & [How to make a great R reproducible example?](https://stackoverflow.com/q/5963269) – Tung Apr 12 '19 at 14:58
  • Hi, I tried to paste my clipboard to show a sample of the df, because I didn't manage to make the packages work. – marie_mrc Apr 12 '19 at 15:40
  • I don't know if it suits your case, but it may be worth loading your data into a SQLite database. https://db.rstudio.com/databases/sqlite/ – Christos Karapapas Apr 12 '19 at 16:10
  • Please, use the `dput` command to yield your data: `dput(df)`. – Rafael Toledo Apr 15 '19 at 12:09
  • Hi, @RafaelToledo I tried to do it, but it doesn't work at all, my data frame appears but nothing is written correctly, everything is replaced by numbers, and by the letter "L". I don't know why. I'm sorry. – marie_mrc Apr 15 '19 at 12:51
  • That's the right behavior, so you can copy and paste this output into your question. For more information on how to do it, check this [answer](https://stackoverflow.com/a/5963610/6509883). – Rafael Toledo Apr 15 '19 at 12:55

1 Answer


Using DF, shown reproducibly in the Note at the end, use rollapplyr to return TRUE if there are any 1s in the last n rows (one row per day) and FALSE otherwise. Then use ifelse to convert TRUE to 1 and FALSE to NA, and multiply price by the result. The question did not specify how to handle the first n-1 elements, so below we fill them with 1s, which leaves their prices unchanged. (Alternatives would be fill = NA or partial = TRUE; the latter applies any to however many elements are available when there are fewer than n.)

library(zoo)

# n <- 30
n <- 3

transform(DF, price = price * ifelse(rollapplyr(maj, n, any, fill = 1), 1, NA))

giving:

         day maj price
1 2019-01-02   1  1435
2 2019-01-03   0  1435
3 2019-01-04   0  1435
4 2019-01-05   0    NA
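
For comparison, here is a minimal base R sketch of the same idea that does not need zoo. It assumes one row per consecutive day, sorted by day, and it treats the first n-1 rows the way partial = TRUE would (using whatever rows are available). The helper name mask_stale_prices is hypothetical, chosen only for this illustration.

```r
# Sample data mirroring the Note; one row per consecutive day is assumed.
DF <- data.frame(
  day   = as.Date(c("2019-01-02", "2019-01-03", "2019-01-04", "2019-01-05")),
  maj   = c(1L, 0L, 0L, 0L),
  price = c(1435, 1435, 1435, 1435)
)

n <- 3  # small window for the sample; use 30 for the real data

# mask_stale_prices is a hypothetical helper, not part of base R or zoo.
mask_stale_prices <- function(df, n) {
  # Running count of updates up to and including each row.
  updates_so_far <- cumsum(df$maj)
  # The same running count as of n rows earlier (0 for the first n rows).
  updates_before_window <- c(rep(0, n), updates_so_far)[seq_along(updates_so_far)]
  # TRUE when at least one update falls within the trailing n rows.
  updated_recently <- (updates_so_far - updates_before_window) > 0
  df$price[!updated_recently] <- NA
  df
}

result <- mask_stale_prices(DF, n)
print(result)
```

Because it relies on a single cumsum rather than a rolling function call per window, this variant may also be convenient on the full data frame of 80,000+ rows, though that has not been benchmarked here.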

Note

Lines <- "day       maj     price
2019-01-02    1      1435
2019-01-03    0      1435
2019-01-04    0      1435
2019-01-05    0      1435"
DF <- read.table(text = Lines, header = TRUE, strip.white = TRUE)
DF$day <- as.Date(DF$day)
G. Grothendieck