0

I have the following problem. I have a data frame that looks like this:

Date       Period  Subject    Skill
1.1.2020     1        1           1
1.1.2020     1        2           3
1.1.2020     1        3           1 
1.1.2020     1        4           3 
1.1.2020     2        1           NA
1.1.2020     2        2           NA
1.1.2020     2        3           NA 
1.1.2020     2        4           NA 
1.1.2020     3        1           NA
1.1.2020     3        2           NA
1.1.2020     3        3           NA 
1.1.2020     3        4           NA 
1.1.2021     1        1           4
1.1.2021     1        2           4
1.1.2021     1        3           1 
1.1.2021     1        4           3 
1.1.2021     2        1           NA
1.1.2021     2        2           NA
1.1.2021     2        3           NA
1.1.2021     2        4           NA 
1.1.2021     3        1           NA
1.1.2021     3        2           NA
1.1.2021     3        3           NA
1.1.2021     3        4           NA 

I would like to copy the Skill value from Period 1 for each subject that took the test on the same date (in this case representing a unique group) to the same subject in Period 2 and Period 3. The desired output should look like this:

Date       Period  Subject    Skill
1.1.2020     1        1           1
1.1.2020     1        2           3
1.1.2020     1        3           1 
1.1.2020     1        4           3 
1.1.2020     2        1           1
1.1.2020     2        2           3
1.1.2020     2        3           1
1.1.2020     2        4           3
1.1.2020     3        1           1
1.1.2020     3        2           3
1.1.2020     3        3           1
1.1.2020     3        4           3
1.1.2021     1        1           4
1.1.2021     1        2           4
1.1.2021     1        3           1 
1.1.2021     1        4           3 
1.1.2021     2        1           4
1.1.2021     2        2           4
1.1.2021     2        3           1
1.1.2021     2        4           3
1.1.2021     3        1           4
1.1.2021     3        2           4
1.1.2021     3        3           1
1.1.2021     3        4           3

I saw a similar thread How to copy value of a cell to other rows based on the value of other two columns? however there is just one missing NA value per group (and not many NAs per group). With the following solution from zoo package I am only replacing each NA with the most recent non-NA prior to it.

x %>% group_by(Date, Subject) %>% mutate(Skill= na.locf(Skill))

In short, I want to match the value of a cell to other rows based on the same Subject ID within a group.

Thanks for all your answers!

Nad Pat
  • 3,129
  • 3
  • 10
  • 20

2 Answers2

1

Libraries

library(tidyverse)

Data

df <-
  structure(list(Date = structure(c(18262, 18262, 18262, 18262, 
                                    18262, 18262, 18262, 18262, 18262, 18262, 18262, 18262, 18628, 
                                    18628, 18628, 18628, 18628, 18628, 18628, 18628, 18628, 18628, 
                                    18628, 18628), class = "Date"), Period = c(1L, 1L, 1L, 1L, 2L, 
                                                                               2L, 2L, 2L, 3L, 3L, 3L, 3L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 3L, 
                                                                               3L, 3L, 3L), Subject = c(1L, 2L, 3L, 4L, 1L, 2L, 3L, 4L, 1L, 
                                                                                                        2L, 3L, 4L, 1L, 2L, 3L, 4L, 1L, 2L, 3L, 4L, 1L, 2L, 3L, 4L), 
                 Skill = c(1L, 3L, 1L, 3L, NA, NA, NA, NA, NA, NA, NA, NA, 
                           4L, 4L, 1L, 3L, NA, NA, NA, NA, NA, NA, NA, NA)), class = "data.frame", row.names = c(NA, 
                                                                                                                 -24L))

Code

df %>% 
  group_by(Date,Subject) %>% 
  fill(Skill) 

Output

# A tibble: 24 x 4
# Groups:   Date, Subject [8]
   Date       Period Subject Skill
   <date>      <int>   <int> <int>
 1 2020-01-01      1       1     1
 2 2020-01-01      1       2     3
 3 2020-01-01      1       3     1
 4 2020-01-01      1       4     3
 5 2020-01-01      2       1     1
 6 2020-01-01      2       2     3
 7 2020-01-01      2       3     1
 8 2020-01-01      2       4     3
 9 2020-01-01      3       1     1
10 2020-01-01      3       2     3
Vinícius Félix
  • 8,448
  • 6
  • 16
  • 32
0

How about just taking the max or min over the date and subject?

Quixotic22
  • 2,894
  • 1
  • 6
  • 14