My data looks like this:

> data
   ID Price
1   1     1
2   2     3
3   3    NA
4   4    NA
5   5     7
6   6     6
7   7    NA
8   8    NA
9   9    NA
10 10    10

I want to fill each missing value with the last available value, so that my data looks like this:

> data_final
   ID Price
1   1     1
2   2     3
3   3     3
4   4     3
5   5     7
6   6     6
7   7     6
8   8     6
9   9     6
10 10    10

Any help would be greatly appreciated.

Rajarshi Bhadra
  • See, e.g., [here](http://stackoverflow.com/questions/22693173/imputing-missing-values-linearly-in-r), [here](http://stackoverflow.com/questions/7188807/interpolate-na-values), and the arguments to `?approx` – alexis_laz Apr 08 '16 at 12:03
  • Hmm, which method is robust against an NA in the first position? – chinsoon12 Apr 08 '16 at 12:14

3 Answers

We can use `na.locf` from the zoo package, which carries the last observation forward:

library(zoo)
data$Price <- na.locf(data$Price)
data$Price
#[1]  1  3  3  3  7  6  6  6  6 10
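
On the question raised in the comments about an NA in the first position: a minimal sketch, assuming a made-up vector `x` that is not part of the original data. By default `na.locf` drops leading NAs (`na.rm = TRUE`), so the result can be shorter than the input and the assignment back into a data frame column would then fail. Passing `na.rm = FALSE` keeps the length, and `fromLast = TRUE` can back-fill whatever is left.

```r
library(zoo)

# Hypothetical vector whose first value is missing
x <- c(NA, 1, NA, 3)

# Default na.rm = TRUE drops the leading NA, so the result is shorter than x
dropped <- na.locf(x)
dropped
#[1] 1 1 3

# na.rm = FALSE keeps the original length and leaves the leading NA in place
kept <- na.locf(x, na.rm = FALSE)
kept
#[1] NA  1  1  3

# Optionally fill the remaining leading NA from the next available value
filled <- na.locf(kept, fromLast = TRUE)
filled
#[1] 1 1 1 3
```

Whether back-filling the first value is appropriate depends on the data; leaving it as NA is often the safer choice.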
akrun

You can use `fill` from tidyr, which fills missing values downwards by default:

> library(tidyr)
> fill(data, Price)
   ID Price
1   1     1
2   2     3
3   3     3
4   4     3
5   5     7
6   6     6
7   7     6
8   8     6
9   9     6
10 10    10
Vincent Bonhomme

Here is a method in base R:

# construct data.frame
data <- data.frame(ID = 1:10, Price = c(1, 3, NA, NA, 7, 6, NA, NA, NA, 10))
# each pass copies the preceding value into every NA position, so a run of
# k consecutive NAs is filled after k passes (assumes the first value is not NA)
while (anyNA(data$Price)) {
  missing <- which(is.na(data$Price))
  data$Price[missing] <- data$Price[missing - 1]
}
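
The loop above needs one pass per NA in the longest run and relies on the first value being present. As an alternative, here is a vectorised base R sketch that avoids the loop and leaves a leading NA untouched. The helper name `fill_forward` is made up for illustration and is not part of base R.

```r
# Hypothetical helper: carry the last non-missing value forward
fill_forward <- function(x) {
  seen <- !is.na(x)
  # cumsum(seen) counts the observed values so far, i.e. the position of the
  # most recent observed value within x[seen] (0 before the first one).
  # Prepending NA and shifting the index by 1 maps that 0 to NA.
  c(NA, x[seen])[cumsum(seen) + 1]
}

data <- data.frame(ID = 1:10, Price = c(1, 3, NA, NA, 7, 6, NA, NA, NA, 10))
data$Price <- fill_forward(data$Price)
data$Price
#[1]  1  3  3  3  7  6  6  6  6 10

# A leading NA stays NA because there is nothing to carry forward
fill_forward(c(NA, 2, NA))
#[1] NA  2  2
```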
lmo