-1

Here's what I would like to a achieve as a function in Excel, but I can't seem to find a solution to do it in R.

This is what I tried to do but it does not seem to allow me to operate with the previous values of the new column I'm trying to make.

Here is a reproducible example:

library(dplyr)
set.seed(42)  ## for sake of reproducibility

dat <- data.frame(date=seq.Date(as.Date("2020-12-26"), as.Date("2020-12-31"), "day"))

This would be the output of the dataframe:

dat
        date
1 2020-12-26
2 2020-12-27
3 2020-12-28
4 2020-12-29
5 2020-12-30
6 2020-12-31

Desired output:

        date  periodNumber
1 2020-12-26  1
2 2020-12-27  2
3 2020-12-28  3
4 2020-12-29  4
5 2020-12-30  5
6 2020-12-31  6

My try at this:

dat %>% 
  mutate(periodLag = dplyr::lag(date)) %>% 
  mutate(periodNumber = ifelse(is.na(periodLag)==TRUE, 1, 
                            ifelse(date == periodLag, dplyr::lag(periodNumber), (dplyr::lag(periodNumber) + 1))))

Excel formula screenshot:

1

Nimantha
  • 6,405
  • 6
  • 28
  • 69
  • 2
    It's easier to help you if you include a simple [reproducible example](https://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example) with sample input and desired output that can be used to test and verify possible solutions. Please explain in words what you are trying to accomplish rather than just asking for code to be translated from one language to another. That will allow more people to help you. – MrFlick Nov 03 '21 at 07:27
  • in your second mutate expression you are trying to set `periodNumber` to `lag(periodNumber)` (which doesn't exist yet). Maybe you wanted to use `periodLag`? (i.e. you are trying to create a **new** column A with `A = ifelse(some_test, lag(A), lag(A) + 1)` – dario Nov 03 '21 at 07:37
  • I would love to get the period number, not the date of the period. Therefore, I am trying to check if the current `date` equals the previous `date`, and in case it does, then use the same `periodNumber` as in in the previous row. – freakingcloud Nov 03 '21 at 07:42
  • `data.table::rleid(dat$date)` – Wimpel Nov 03 '21 at 07:43
  • But `periodNumber` does not exist yet (at the time when you are trying to use it in the `ifelse`). *"I would love to get the period number, not the date of the period"* - In that case you need two separat `ifelse`, where the first `ìfelse` "initializes the column `periodNumber`... Generally, you can't assign a variable that does not exist to itself... – dario Nov 03 '21 at 07:43
  • Could you please elaborate more on what you meant with initializing the column `periodNumber` ? – freakingcloud Nov 03 '21 at 07:47
  • Initialize -> set it to a value, will it into existence, make it a variable... Not sure what's there to explain. But I'd suggest stepping (mentally) through your code above and ask yourself: Does the value I am trying to use exist? And if it does not: How can you make it exist?... – dario Nov 03 '21 at 07:51
  • Thank you, makes sense now. Mutated the periodNumber column by seperating the other ifelse. – freakingcloud Nov 03 '21 at 07:53
  • That looks much better – dario Nov 03 '21 at 07:54
  • Do you maybe see why it lets me run the code but does not fully execute it? (the + is open in the terminal when I run it) ? dat %>% mutate(periodLag = dplyr::lag(date)) %>% mutate(periodNumber = ifelse(is.na(periodLag)==TRUE, 1, 0) %>% mutate(Period = ifelse(date == periodLag, dplyr::lag(periodNumber), dplyr::lag(periodNumber) + 1)) – freakingcloud Nov 03 '21 at 07:54
  • check your parentheses... there is one missing – dario Nov 03 '21 at 08:02

1 Answers1

0

You could use dplyr's cur_group_id():

library(dplyr)
set.seed(42)

# I used a larger example
dat <- data.frame(date=sample(seq.Date(as.Date("2020-12-26"), as.Date("2020-12-31"), "day"), size = 30, replace = TRUE))

dat %>% 
  arrange(date) %>% # needs sorting because of the random example
  group_by(date) %>% 
  mutate(periodNumber = cur_group_id())

This returns

# A tibble: 30 x 2
# Groups:   date [6]
   date       periodNumber
   <date>            <int>
 1 2020-12-26            1
 2 2020-12-26            1
 3 2020-12-26            1
 4 2020-12-26            1
 5 2020-12-26            1
 6 2020-12-26            1
 7 2020-12-26            1
 8 2020-12-26            1
 9 2020-12-27            2
10 2020-12-27            2
11 2020-12-27            2
12 2020-12-27            2
13 2020-12-27            2
14 2020-12-27            2
15 2020-12-27            2
16 2020-12-28            3
17 2020-12-28            3
18 2020-12-28            3
19 2020-12-29            4
20 2020-12-29            4
21 2020-12-29            4
22 2020-12-29            4
23 2020-12-29            4
24 2020-12-29            4
25 2020-12-30            5
26 2020-12-30            5
27 2020-12-30            5
28 2020-12-30            5
29 2020-12-30            5
30 2020-12-31            6
Martin Gal
  • 16,640
  • 5
  • 21
  • 39