We can use functions from dplyr
, tidyr
, and imputeTS
packages.
First, let's create a data frame similar to your example.
# Create example data frame
dt <- data_frame(Order = c(rep(1, 3), rep(2, 4)),
OrderSteps = c(1:3, 1:4),
InputQuantity = c(300.56, NA, NA, 65.89, NA, NA, NA),
OutputQuantity = c(NA, NA, 243.65, NA, NA, NA, 12.54))
dt
# A tibble: 7 x 4
Order OrderSteps InputQuantity OutputQuantity
<dbl> <int> <dbl> <dbl>
1 1 1 300.56 NA
2 1 2 NA NA
3 1 3 NA 243.65
4 2 1 65.89 NA
5 2 2 NA NA
6 2 3 NA NA
7 2 4 NA 12.54
Second, convert the data frame from wide format to long format
dt2 <- dt %>%
gather(Type, Value, InputQuantity:OutputQuantity) %>%
arrange(Order, OrderSteps)
dt2
# A tibble: 14 x 4
Order OrderSteps Type Value
<dbl> <int> <chr> <dbl>
1 1 1 InputQuantity 300.56
2 1 1 OutputQuantity NA
3 1 2 InputQuantity NA
4 1 2 OutputQuantity NA
5 1 3 InputQuantity NA
6 1 3 OutputQuantity 243.65
7 2 1 InputQuantity 65.89
8 2 1 OutputQuantity NA
9 2 2 InputQuantity NA
10 2 2 OutputQuantity NA
11 2 3 InputQuantity NA
12 2 3 OutputQuantity NA
13 2 4 InputQuantity NA
14 2 4 OutputQuantity 12.54
Third, filter the data frame and then impute the NA
.
dt3 <- dt2 %>%
filter(Type %in% "OutputQuantity" | !is.na(Value)) %>%
mutate(Value = na.interpolation(Value))
dt3
# A tibble: 9 x 4
Order OrderSteps Type Value
<dbl> <int> <chr> <dbl>
1 1 1 InputQuantity 300.5600
2 1 1 OutputQuantity 281.5900
3 1 2 OutputQuantity 262.6200
4 1 3 OutputQuantity 243.6500
5 2 1 InputQuantity 65.8900
6 2 1 OutputQuantity 52.5525
7 2 2 OutputQuantity 39.2150
8 2 3 OutputQuantity 25.8775
9 2 4 OutputQuantity 12.5400
Finally, merge dt2
and dt3
, and then fill the NA
with the previous record. After that, convert the data frame back to wide format.
dt4 <- dt2 %>%
left_join(dt3, by = c("Order", "OrderSteps", "Type")) %>%
fill(Value.y) %>%
select(-Value.x) %>%
spread(Type, Value.y)
dt4
# A tibble: 7 x 4
Order OrderSteps InputQuantity OutputQuantity
* <dbl> <int> <dbl> <dbl>
1 1 1 300.5600 281.5900
2 1 2 281.5900 262.6200
3 1 3 262.6200 243.6500
4 2 1 65.8900 52.5525
5 2 2 52.5525 39.2150
6 2 3 39.2150 25.8775
7 2 4 25.8775 12.5400
dt4
is the final output.