
I have a data frame with a column that looks like this:

[1] i5olv
[2]
[3] udp3o
[4]
[5]
[6]
[7] uem5i
[8] b0047
[9]
[10]

Notice that the elements have no specific order and there may be a variable number of empty rows between non-empty elements. I would like the column to look like this instead:

[1] i5olv
[2] i5olv
[3] udp3o
[4] udp3o
[5] udp3o
[6] udp3o
[7] uem5i
[8] b0047
[9] b0047
[10] b0047

How can I do this in a vectorized way? I can do it with a for-loop that caches the last non-empty value, but that is slow.

ruser45381
    You could convert empty cells to `NA`s and use `na.locf` from the `zoo` package. – David Arenburg May 05 '15 at 15:15
    Fyi, it's helpful to include easily reproducible data with your question. See the "data" part of akrun's answer, for example. You can get that sort of code using the `dput` function on your data frame. – Frank May 05 '15 at 15:30
    @DavidArenburg Nice one! I didn't know about the na.locf function. I can see from the results on Google and StackOverflow that it's exactly what I needed. Related StackOverflow questions: https://stackoverflow.com/questions/19838735/how-to-na-locf-in-r-without-using-additional-packages https://stackoverflow.com/questions/1782704/propagating-data-within-a-vector/1783275#1783275 – ruser45381 May 05 '15 at 15:31
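
The suggestion in the comments above can be sketched roughly as follows. This is a minimal illustration, not code from the original thread: it assumes the `zoo` package is installed, and the vector `x` is made-up sample data matching the question. Empty strings are first converted to `NA`, then `na.locf` carries the last observation forward; `na.rm = FALSE` keeps any leading `NA` values instead of dropping them.

```r
library(zoo)  # assumed to be installed; provides na.locf()

# Illustrative sample data mirroring the column in the question
x <- c("i5olv", "", "udp3o", "", "", "", "uem5i", "b0047", "", "")

# Treat empty cells as missing values
x[x == ""] <- NA

# Carry the last non-missing value forward; keep leading NAs if any
filled <- na.locf(x, na.rm = FALSE)
print(filled)
```

For a data frame, the same two steps would be applied to the relevant column.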

1 Answer


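One option is to use data.table: group the rows by a running count of non-empty values, then fill each group with its first element.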

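library(data.table)
# Each non-empty value starts a new group; every row in a group gets the group's first value.
setDT(df1)[, Col1 := Col1[1L], by = cumsum(Col1 != "")]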

 #     Col1
 #  1: i5olv
 #  2: i5olv
 #  3: udp3o
 #  4: udp3o
 #  5: udp3o
 #  6: udp3o
 #  7: uem5i
 #  8: b0047
 #  9: b0047
 # 10: b0047

data

 df1 <- structure(list(Col1 = c("i5olv", "", "udp3o", "", "", "",
   "uem5i", "b0047", "", "")), .Names = "Col1", row.names = c(NA, -10L),
   class = "data.frame")
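
The same running-count idea can also be sketched in base R without any packages. This is an illustrative variant rather than part of the answer above: `fill_down` is a hypothetical helper name, and it assumes the column is a character vector with no `NA` values. `cumsum(x != "")` gives, for each row, the index of the most recent non-empty value, which is then used to look that value up.

```r
# Hypothetical helper (not from the original answer): fill empty strings
# with the most recent non-empty value. Assumes x is character with no NAs.
fill_down <- function(x) {
  nonempty <- x != ""
  # Prepend "" so rows before the first non-empty value (count 0) stay empty
  c("", x[nonempty])[cumsum(nonempty) + 1L]
}

x <- c("i5olv", "", "udp3o", "", "", "", "uem5i", "b0047", "", "")
print(fill_down(x))
```

With the data above, this could be applied as `df1$Col1 <- fill_down(df1$Col1)`.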
akrun