I have follow-up data for several people. If a person has 10 observations, for example, their name appears only on their first row; the 9 following rows have no name.

My goal is to fill in the name column.

Here is a reproducible example of my data:

test = data.frame(name = c("Paul",NA,NA,"John",NA,"Ethan",NA,NA),
                  date = c("2016-05-06","2017-05-06","2018-05-06","2012-08-09","2016-02-01","2017-06-06","2017-07-06","2017-08-06"),
                  data = c(1,2,1,NA,2,2,NA,2))

This is what the data looks like:

  name       date data
1  Paul 2016-05-06    1
2  <NA> 2017-05-06    2
3  <NA> 2018-05-06    1
4  John 2012-08-09   NA
5  <NA> 2016-02-01    2
6 Ethan 2017-06-06    2
7  <NA> 2017-07-06   NA
8  <NA> 2017-08-06    2

And this is the result I want:

  name       date data
1  Paul 2016-05-06    1
2  Paul 2017-05-06    2
3  Paul 2018-05-06    1
4  John 2012-08-09   NA
5  John 2016-02-01    2
6 Ethan 2017-06-06    2
7 Ethan 2017-07-06   NA
8 Ethan 2017-08-06    2

I did not find any function that carries a value forward until the next non-NA observation. Note that the data is sorted by person and by date.

BPeif

1 Answer


One option would be tidyr::fill, which by default replaces each NA with the last non-NA value above it:

test = data.frame(name = c("Paul",NA,NA,"John",NA,"Ethan",NA,NA),
                  date = c("2016-05-06","2017-05-06","2018-05-06","2012-08-09","2016-02-01","2017-06-06","2017-07-06","2017-08-06"),
                  data = c(1,2,1,NA,2,2,NA,2))

tidyr::fill(test, name)
#>    name       date data
#> 1  Paul 2016-05-06    1
#> 2  Paul 2017-05-06    2
#> 3  Paul 2018-05-06    1
#> 4  John 2012-08-09   NA
#> 5  John 2016-02-01    2
#> 6 Ethan 2017-06-06    2
#> 7 Ethan 2017-07-06   NA
#> 8 Ethan 2017-08-06    2
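
If adding a package is not an option, the same forward fill can be sketched in base R. This is only a minimal illustration, not part of the tidyr approach above. The helper names (is_known, idx, vals) are made up for the example, and it relies on the data being sorted by person and by date, as stated in the question.

```r
test <- data.frame(name = c("Paul", NA, NA, "John", NA, "Ethan", NA, NA),
                   date = c("2016-05-06", "2017-05-06", "2018-05-06", "2012-08-09",
                            "2016-02-01", "2017-06-06", "2017-07-06", "2017-08-06"),
                   data = c(1, 2, 1, NA, 2, 2, NA, 2))

# TRUE on rows where a name is present (hypothetical helper name)
is_known <- !is.na(test$name)

# Running count of names seen so far: row i belongs to the idx[i]-th name
idx <- cumsum(is_known)

# The non-missing names, in order of appearance
vals <- as.character(test$name[is_known])

# Prepend NA so that rows before the first known name (idx == 0) stay NA
test$name <- c(NA, vals)[idx + 1]

print(test)
```

Only the name column is touched, so the NA values in the data column are left alone. Rows that come before the first known name would remain NA rather than raise an error.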
stefan