I have a data frame with the year (x) and an associated percentage (y):

data.frame(x = c(1997,2000,2003,2006,2009,2010,2013,2014),
           y = c(.02,.023,.025,.024,.026,.027,.029,.031))

Here is a line chart of this data frame:

[Figure: line chart of y against x]

I would like to interpolate my data to get the percentage for the missing years, using a straight line between each pair of known points.

I could fit a linear model to each segment of the curve, but that would be tedious.

Is there a simple way to do it with R?

INPUT:

df = data.frame(
  year=c(1997,2000,2003,2006,2009,2010,2013,2014),
  percent=c(0.020, 0.023, 0.025, 0.024, 0.026, 0.027, 0.029, 0.031)
)

OUTPUT (for a function f):

f(2006)==0.024
f(2007)==0.024666
f(2008)==0.025333
f(2009)==0.026
Axeman
Dan Chaltiel

1 Answer

One way is to use linear interpolation with zoo:

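library(tidyr)
library(zoo)

# Add a row for every missing year; percent is NA in the new rows
df_complete <- complete(df, year = full_seq(year, 1))
# Fill the NAs by linear interpolation between the known values
df_complete$percent <- na.approx(df_complete$percent)

plot(df_complete)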
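
If you want an actual function f as described in the question, a base R alternative (not part of the zoo approach above) might be approxfun() from the stats package. This is only a minimal sketch: it assumes the df from the question, and the name df_all is made up for illustration.

```r
# Known data points, copied from the question
df <- data.frame(
  year    = c(1997, 2000, 2003, 2006, 2009, 2010, 2013, 2014),
  percent = c(0.020, 0.023, 0.025, 0.024, 0.026, 0.027, 0.029, 0.031)
)

# approxfun() returns a function that interpolates linearly
# between the supplied (x, y) points
f <- approxfun(df$year, df$percent)

# Evaluate at the years used in the question's example
f(2006:2009)

# Fill in every year of the range at once
# (df_all is a hypothetical name, not from the original post)
df_all <- data.frame(year = min(df$year):max(df$year))
df_all$percent <- f(df_all$year)
```

With the default settings, f returns NA for years outside 1997 to 2014 rather than extrapolating, which is probably what you want here.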

Axeman