I have two data frames:

> a <- data.frame(x=c(1,2,3,4,5,6,7,8), y=c(1,3,5,7,9,11,13,15))
> a
  x  y
1 1  1
2 2  3
3 3  5
4 4  7
5 5  9
6 6 11
7 7 13
8 8 15

> b <- data.frame(x=c(1,5,7), z=c(2, 4, 6))
> b
  x z
1 1 2
2 5 4
3 7 6

Then I join the two data frames with join (from the plyr package):

> c <- join(a, b, by="x", type="left")
> c
  x  y  z
1 1  1  2
2 2  3 NA
3 3  5 NA
4 4  7 NA
5 5  9  4
6 6 11 NA
7 7 13  6
8 8 15 NA

I need to replace each NA in the z column with the last non-NA value that precedes it. I want the result to look like this:

> c
  x  y  z
1 1  1  2
2 2  3  2
3 3  5  2
4 4  7  2
5 5  9  4
6 6 11  4
7 7 13  6
8 8 15  6
marc_s
kanpu

1 Answer

If your data is not too large, a loop is an elegant option here:

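# which() returns ascending indices, so a run of NAs is filled top to bottom.
# A leading NA has no predecessor and is left as NA.
for (i in which(is.na(c$z))) {
  if (i > 1) c$z[i] <- c$z[i - 1]
}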

gives:

> c
  x  y z
1 1  1 2
2 2  3 2
3 3  5 2
4 4  7 2
5 5  9 4
6 6 11 4
7 7 13 6
8 8 15 6
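
For larger data, the same fill can be done without a loop. The following is only a sketch in base R: the function name fill_forward is made up for illustration, and the example vector mirrors the z column above. It records, for each position, the index of the most recent non-NA value and then indexes into the vector with it.

```r
# fill_forward is a hypothetical helper name, not part of any package.
fill_forward <- function(z) {
  # Index of each non-NA element, 0 where the element is NA
  idx <- seq_along(z) * !is.na(z)
  # Running maximum = position of the most recent non-NA value so far
  idx <- cummax(idx)
  # Positions before the first non-NA value have no predecessor: keep them NA
  idx[idx == 0] <- NA
  z[idx]
}

z <- c(2, NA, NA, NA, 4, NA, 6, NA)
fill_forward(z)
# [1] 2 2 2 2 4 4 6 6
```

With the joined data frame this would be used as c$z <- fill_forward(c$z).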

data:

library(plyr)
a <- data.frame(x=c(1,2,3,4,5,6,7,8), y=c(1,3,5,7,9,11,13,15))
b <- data.frame(x=c(1,5,7), z=c(2, 4, 6))
c <- join(a, b, by="x", type="left")

You might also want to look at na.locf ("last observation carried forward") in the zoo package, which does the same fill without an explicit loop.
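
A minimal sketch of the na.locf route, assuming the zoo package is installed; the vector below mirrors the z column from the question. Setting na.rm = FALSE keeps the result the same length as the input, so a leading NA (if there were one) would stay NA instead of being dropped.

```r
library(zoo)  # assumption: zoo is installed

z <- c(2, NA, NA, NA, 4, NA, 6, NA)

# Carry the last non-NA value forward; na.rm = FALSE preserves the length
filled <- na.locf(z, na.rm = FALSE)
filled
# [1] 2 2 2 2 4 4 6 6
```

Applied to the joined data frame, this would be c$z <- na.locf(c$z, na.rm = FALSE).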

mts