1

I have this data set:

yDF = structure(list(Date = structure(c(1L,2L,3L,4L,5L,6L,7L,9L,9L,10L,11L), .Label = c("3/31/2018","4/1/2018", "4/2/2018", "4/3/2018", "4/4/2018", 
                                                                                        "4/5/2018", "4/6/2018", "4/8/2018", "4/8/2018", "4/9/2018","4/10/2018"), class = "factor"), 
                     Group = structure(c(2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 
                                         2L), .Label = c("Decrease", "Increase"), class = "factor"), 
                     ID = c(5002, 5002, 5002, 5002, 5002, 5002, 5002, 5002, 
                            5002, 5002, 5002), Week = c(13, 13, 14, 14, 14, 14, 14, 
                                                        14, 14, 15, 15)), row.names = 1:10, class = "data.frame")

Which should look like this:


    Date        Group       ID      Week
1   3/31/2018   Increase    5002    13
2   4/1/2018    Increase    5002    13
3   4/2/2018    Increase    5002    14
4   4/3/2018    Increase    5002    14
5   4/4/2018    Increase    5002    14
6   4/5/2018    Increase    5002    14
7   4/6/2018    Increase    5002    14
8   4/8/2018    Increase    5002    14
9   4/8/2018    Increase    5002    14
10  4/9/2018    Increase    5002    15

I would like to add a new column called "session,"

  • where a row is labeled "pre" if the ID column and the week column are new OR if they are the same value and within 2 weeks of the original value.
  • If the same ID (like 5002) has a week number that is "original week + 2 weeks or more" (like Week 15) then it should be labeled as "post."

I have played with if else functions but cannot get the correct output. Ideally it will look like this:

(keep in mind I have over 100 unique subject ID's)

```
        Date    Group   ID          Week   Session
30 3/31/2018 Increase 5002           13    Pre
31  4/1/2018 Increase 5002           13    Pre
32  4/2/2018 Increase 5002           14    Pre
33  4/3/2018 Increase 5002           14    Pre
34  4/4/2018 Increase 5002           14    Pre
35  4/5/2018 Increase 5002           14    Pre
36  4/6/2018 Increase 5002           14    Pre
37  4/8/2018 Increase 5002           14    Pre
38  4/8/2018 Increase 5002           14    Pre
39  4/9/2018 Increase 5002           15    Post
40 4/10/2018 Increase 5002           15    Post
```
CanyonView
  • 401
  • 3
  • 15
  • 2
    would be great if you could give us the code to reproduce this data. – Sachin PC Jul 17 '19 at 23:56
  • How can I add the code? a subset is pasted in the question to copy and paste as a data frame – CanyonView Jul 18 '19 at 00:10
  • @shai73 use `dput(head(your_df, 10))` and add it's output to your code. Use `dput(your_df[30:40, ])` if you specifically want rows 30 to 40 as shown in your post. – Shree Jul 18 '19 at 00:17
  • Thanks but it doesn't seem to work. Is there another option? I tried both suggestions and it keeps structuring the entire CSV file, not just first 10 or 30:40 – CanyonView Jul 18 '19 at 00:35
  • See [how to make a reproducible R example](https://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example) – camille Jul 18 '19 at 00:55
  • I manually fixed the dput output. Please let me know if this is suitable – CanyonView Jul 18 '19 at 01:18

1 Answers1

0

We can use mutate, ifelse, and first to achieve this task.

library(tidyverse)

yDF2 <- yDF %>%
  group_by(ID) %>%
  mutate(Session = ifelse(Week <= first(Week) + 1, "Pre", "Post")) %>%
  ungroup()
yDF2
# # A tibble: 10 x 5
#    Date      Group       ID  Week Session
#    <chr>     <chr>    <int> <int> <chr>  
#  1 3/31/2018 Increase  5002    13 Pre    
#  2 4/1/2018  Increase  5002    13 Pre    
#  3 4/2/2018  Increase  5002    14 Pre    
#  4 4/3/2018  Increase  5002    14 Pre    
#  5 4/4/2018  Increase  5002    14 Pre    
#  6 4/5/2018  Increase  5002    14 Pre    
#  7 4/6/2018  Increase  5002    14 Pre    
#  8 4/8/2018  Increase  5002    14 Pre    
#  9 4/8/2018  Increase  5002    14 Pre    
# 10 4/9/2018  Increase  5002    15 Post  

DATA

yDF <- read.table(text = "    Date        Group       ID      Week
1   '3/31/2018'   Increase    5002    13
2   '4/1/2018'    Increase    5002    13
3   '4/2/2018'    Increase    5002    14
4   '4/3/2018'    Increase    5002    14
5   '4/4/2018'    Increase    5002    14
6   '4/5/2018'    Increase    5002    14
7   '4/6/2018'    Increase    5002    14
8   '4/8/2018'    Increase    5002    14
9   '4/8/2018'    Increase    5002    14
10  '4/9/2018'    Increase    5002    15",
                  stringsAsFactors = FALSE, header = TRUE)
www
  • 38,575
  • 12
  • 48
  • 84