I'm struggling to solve this apparently simple problem in R, with no success so far. I have a data.frame with a character variable that has some blank and some non-blank values. I'm trying to fill those blanks with the last non-blank value found in the same variable, going top-down, as in the following example for the variable 'Species' in data.frame 'want' vs. 'have'.

Thanks in advance to anyone who can help!

set.seed(12346)
foi <- split(iris, iris$Species)
want <- do.call("rbind", lapply(foi, function(x){
  x[1:sample(1:10, 1), ]
}))
row.names(want) <- NULL
want$Species <- as.character(want$Species)

have <- want
have$Species[2:10] <- ""
have$Species[12:16] <- ""
have$Species[18:21] <- ""

head(have, 20)
head(want, 20)
Jlopes

1 Answer

A simple for loop, assuming the first value is non-blank:

for(i in which(have$Species=="")) have$Species[i]=have$Species[i-1]

If speed is an issue and your data is huge, you could instead split the variable into blocks of consecutive blank values and fill each block with the nearest preceding non-blank value.
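One way to sketch that block-wise idea without an explicit loop is to number the blocks with cumsum(). This is only an illustration in base R: the helper name fill_down is made up here, and it assumes the column holds no NA values, only "" for blanks.

```r
# fill_down is a hypothetical helper name, not part of the original answer.
# Assumes x is a character vector with no NA values; blanks are "".
fill_down <- function(x, blank = "") {
  is_value <- x != blank      # TRUE where a non-blank value sits
  block <- cumsum(is_value)   # block number = count of non-blank values seen so far
  # Prepend a blank so that leading blanks (block 0) stay blank
  c(blank, x[is_value])[block + 1]
}

x <- c("setosa", "", "", "versicolor", "", "virginica", "")
fill_down(x)
```

With the data from the question this would be used as have$Species <- fill_down(have$Species). Unlike the loop, it does not fail when the first value is blank; such leading blanks are simply left as they are.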

Frostic