I have the following data:

dates       CLOSE    USED
20110309    58,1483 Historico
NA          58,1483 
NA          57,0001 
20110310    34,999  Historico
NA          57,1272 
NA          55,9756 
20110311    59,898  Historico
NA          56,3055 
NA          55,1518 

I want to copy the values of dates and USED down into the following rows wherever they are empty or NA, as in the table below:

dates       CLOSE      USED
20110309    58,1483 Historico
20110309    57,0001 Historico
20110310    57,1272 Historico
20110310    55,9756 Historico
20110311    56,3055 Historico
20110311    55,1518 Historico

I use a for loop to do it:

  # Start at row 2: row 1 has no previous row to copy from
  for (j in 2:nrow(data)) {
    if (is.na(data$dates[j]) || data$USED[j] == "") {
      data$dates[j] <- data$dates[j - 1]
      data$USED[j] <- data$USED[j - 1]
    }
  }

It's a bit slow because of the loop, and my files are large, so I was wondering if there's a faster way to do it.

I've also tried using the which function, but it doesn't work properly:

data$dates[which(is.na(data$dates))]=data$dates[which(is.na(data$dates))-1]

It only works when there is a single empty line, as shown below:

dates       CLOSE    USED
20110309    58,1483 Historico
20110309    58,1483 Historico
NA          57,0001 
20110310    34,999  Historico
20110310    57,1272 Historico
NA          55,9756 
20110311    59,898  Historico
20110311    56,3055 Historico
NA          55,1518 

If anyone knows a faster way to do it...

Thanks!

Ahinoa

2 Answers

You can do it using data.table and zoo. The filling itself is done by na.locf from zoo (last observation carried forward):

library(data.table)
library(zoo)
setDT(data)
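# na.rm = FALSE keeps any leading NA so the column length is unchanged
data[, dates := na.locf(dates, na.rm = FALSE)]
data[, USED := na.locf(USED, na.rm = FALSE)]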
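
If adding packages is not an option, the same fill can be sketched in base R. The which() attempt in the question fills only one row per gap because the right-hand side is read from the original vector before any assignment happens, so the second NA in a run copies another NA. The sketch below instead looks up, for every row, the index of the most recent non-missing row. The helper name fill_down and the sample data frame are assumptions made for illustration. Note that na.locf only fills NA, so empty strings in USED have to be flagged as missing explicitly, as the is_missing argument does here.

```r
# Sample rows mirroring the question (CLOSE kept as text because of the decimal comma)
data <- data.frame(
  dates = c(20110309, NA, NA, 20110310, NA, NA, 20110311, NA, NA),
  CLOSE = c("58,1483", "58,1483", "57,0001", "34,999", "57,1272",
            "55,9756", "59,898", "56,3055", "55,1518"),
  USED  = c("Historico", "", "", "Historico", "", "", "Historico", "", ""),
  stringsAsFactors = FALSE
)

# Hypothetical helper: carry the last non-missing value forward, without a loop
fill_down <- function(x, is_missing = is.na(x)) {
  # Index of the most recent non-missing element at each position (0 if none yet)
  idx <- cummax(ifelse(is_missing, 0L, seq_along(x)))
  # Positions with no earlier value stay NA
  idx[idx == 0L] <- NA_integer_
  x[idx]
}

data$dates <- fill_down(data$dates)
# Treat both NA and "" as missing in USED
data$USED <- fill_down(data$USED, is.na(data$USED) | data$USED == "")
```

Everything here is vectorised, so it should scale much better than the row-by-row loop on large files.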
sm925

dplyr::lag() should be able to do what you want.
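
One caveat: lag() shifts a column by a single row, so used once it fills only the first NA of each run, which is the same limitation as the which() attempt in the question. In the tidyverse, the function usually used for this is tidyr::fill(). A minimal sketch, assuming dplyr and tidyr are installed and using made-up sample rows in the question's layout:

```r
library(dplyr)
library(tidyr)

# Sample rows mirroring the question's layout
data <- data.frame(
  dates = c(20110309, NA, NA, 20110310, NA, NA),
  CLOSE = c("58,1483", "58,1483", "57,0001", "34,999", "57,1272", "55,9756"),
  USED  = c("Historico", "", "", "Historico", "", ""),
  stringsAsFactors = FALSE
)

filled <- data %>%
  mutate(USED = na_if(USED, "")) %>%  # empty strings become NA so fill() can see them
  fill(dates, USED)                   # default direction is "down"
```

fill() carries the last non-missing value down through runs of any length, so no loop or repeated lag() calls are needed.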