-1

Very simple and common task:
I need to FILL DOWN in data.table (similar to autofill function in MS Excel) so that

library(data.table)
DT <- fread(
"Paul 32
NA    45
NA    56
John  1
NA    5
George 88
NA    112")

becomes

Paul 32
Paul 45
Paul 56
John 1
John 5
George 88
George 112

Thank you!

IVIM
  • 2,167
  • 1
  • 15
  • 41
  • 1
    See if [this](https://stackoverflow.com/questions/7735647/replacing-nas-with-latest-non-na-value) helps. – Rui Barradas Apr 30 '19 at 14:39
  • Brilliant idea - to treat data as a timeseries! And then indeed there are various approaches to impute NA in ts! – IVIM May 01 '19 at 17:51
  • Originally, this question had two example to resolve. The second one was later removed - to simply the question. It is shown below. The answer to it was given below. EXAMPLE 2: DT <- fread( "Paul, 1 Paul, 2 Paul, NA Paul, NA John, 100 John, 110 John, NA John, NA") becomes Paul 1 Paul 2 Paul 3 Paul 4 John 100 John 110 John 120 John 130 – IVIM May 07 '19 at 14:52

2 Answers2

2

Yes the best way to do this is to use @Rui Barradas idea of the zoo package. You can simply do it in one line of code with the na.locf function.

library(zoo)
DT[, V1:=na.locf(V1)]

Replace the V1 with whatever you name your column after reading in the data with fread. Good luck!

Jason Johnson
  • 451
  • 3
  • 7
  • This neatly resolves Example 1. Once I write a code for Example 2, I'll post it here. Thanks! – IVIM May 01 '19 at 17:52
  • Related solution: replacing NA's with `0`: https://stackoverflow.com/questions/7235657/fastest-way-to-replace-nas-in-a-large-data-table: ` for (i in names(DT)) DT[is.na(get(i)), (i):=0]` – IVIM Sep 12 '19 at 19:16
  • Another way is posted here: https://stackoverflow.com/questions/26171958/fill-in-missing-values-by-group-in-data-table - `DT[, filled4 := DT[!is.na(value)][DT, value, roll = T]]`. But I could not make it to run on my example ;( – IVIM Sep 13 '19 at 21:16
1

For example 2, you can consider using stats::spline for extrapolation as follows:

DT2[is.na(V2), V2 := 
    as.integer(DT2[, spline(.I[!is.na(V2)], V2[!is.na(V2)], xout=.I[is.na(V2)]), by=.(V1)]$y)]

output:

     V1  V2
1: Paul   1
2: Paul   2
3: Paul   3
4: Paul   4
5: John 100
6: John 110
7: John 120
8: John 130

data:

DT2 <- fread(
"Paul, 1
Paul, 2 
Paul, NA 
Paul, NA 
John, 100
John, 110
John, NA
John, NA")
chinsoon12
  • 25,005
  • 4
  • 25
  • 35
  • This is the answer for a more general Autofill solution, which was originally asked in EXAMPLE 2 of this question, and later removed from question, which is DT <- fread( "Paul, 1 Paul, 2 Paul, NA Paul, NA John, 100 John, 110 John, NA John, NA") becomes Paul 1 Paul 2 Paul 3 Paul 4 John 100 John 110 John 120 John 130 – IVIM May 07 '19 at 14:49