I have data similar to this:

v1 <- c("Fail", 20, 30, "Out", NA, 32, 33, 10)
v2 <- c(10, NA, NA, "Out", "Fail", 34, 35, 30)
df <- data.frame(v1,v2)

I need to transform this data frame so that each word ('Fail', 'Out') or NA is replaced by the value immediately preceding it or, if there is no preceding value, by the next value that follows.

How can I do this in modern R (for example, with dplyr)? I tried something like the following, based on this link:

df <- df %>% mutate(v11 = ifelse(v1 %in% "Fail", lag(),
                     ifelse(v1 %in% "Out", lag()),
                     ifelse(is.na(v1) %in% lag(), v1)))
bbiasi
  • Why does `v1[5]`, which is `NA`, become 32 instead of 30? – Onyambu Aug 14 '18 at 01:00
  • Possible duplicate of [Replacing NAs with latest non-NA value](https://stackoverflow.com/questions/7735647/replacing-nas-with-latest-non-na-value) – Mike H. Aug 14 '18 at 01:21
  • @MikeH. It seems similar, but it's not 100% the same: the `df` here also contains words. Good new answers were posted here, too. – bbiasi Aug 14 '18 at 01:28

4 Answers

A solution using na.locf from zoo:

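# Turn the marker strings into real NA values (not the string "NA")
df[df == "Fail" | df == "Out"] <- NA
# Fill down first, keeping leading NAs, then fill those from below
zoo::na.locf(zoo::na.locf(df, na.rm = FALSE), fromLast = TRUE)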
  v1 v2
1 20 10
2 20 10
3 30 10
4 30 10
5 30 10
6 32 34
7 33 35
8 10 30
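
For comparison, the same fill-down-then-fill-up idea can be sketched in base R without zoo. This is only an illustration under stated assumptions: `fill_down` and `fill_both` are hypothetical helper names (not zoo functions), and the columns are assumed to be character vectors that should end up numeric.

```r
# Hypothetical helper: carry the last non-NA value forward (base R only).
fill_down <- function(x) {
  idx <- cumsum(!is.na(x))        # how many non-NA values seen so far
  c(NA, x[!is.na(x)])[idx + 1]    # slot 1 stays NA for leading gaps
}

# Fill down, then fill any remaining leading NAs from below.
fill_both <- function(x) rev(fill_down(rev(fill_down(x))))

v1 <- c("Fail", 20, 30, "Out", NA, 32, 33, 10)
v2 <- c(10, NA, NA, "Out", "Fail", 34, 35, 30)
df <- data.frame(v1, v2, stringsAsFactors = FALSE)

# Non-numeric strings such as "Fail" and "Out" become NA on coercion.
num <- lapply(df, function(col) suppressWarnings(as.numeric(col)))
res <- as.data.frame(lapply(num, fill_both))
res
```

Unlike the zoo version, this returns numeric columns rather than the original character or factor columns.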
BENY

You can use the tidyverse:

library(tidyverse)
df %>%
  # set every cell that contains a non-digit character to NA
  replace(array(grepl("\\D", as.matrix(df)), dim(df)), NA) %>%
  mutate_all(~ as.numeric(as.character(.x))) %>%
  fill(v1:v2, .direction = "down") %>%
  fill(v1:v2, .direction = "up")
  v1 v2
1 20 10
2 20 10
3 30 10
4 30 10
5 30 10
6 32 34
7 33 35
8 10 30
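
With more recent package versions, the two fill() calls can probably be collapsed into one. The sketch below assumes dplyr 1.0.0 or later (for `across()`) and tidyr 1.0.0 or later (for `.direction = "downup"`); it is an alternative spelling of the same approach, not something from the original answer.

```r
library(dplyr)
library(tidyr)

v1 <- c("Fail", 20, 30, "Out", NA, 32, 33, 10)
v2 <- c(10, NA, NA, "Out", "Fail", 34, 35, 30)
df <- data.frame(v1, v2, stringsAsFactors = FALSE)

res <- df %>%
  # coerce to numeric; "Fail" and "Out" turn into NA (warnings suppressed)
  mutate(across(everything(),
                ~ suppressWarnings(as.numeric(as.character(.x))))) %>%
  # fill downwards first, then upwards for any leading NAs
  fill(everything(), .direction = "downup")
res
```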
Onyambu

Here is an option with fill from tidyr:

library(tidyverse)
df %>%
  # non-numeric strings are coerced to NA (with a warning)
  mutate_all(~ as.numeric(as.character(.))) %>%
  fill(v1, v2) %>%
  fill(v1, .direction = 'up')
#   v1 v2
#1 20 10
#2 20 10
#3 30 10
#4 30 10
#5 30 10
#6 32 34
#7 33 35
#8 10 30
akrun

First convert the non-numeric strings to NA using read.table, giving df0, and then use na.approx. This gives a matrix; if you want a data frame, use as.data.frame on the result.

library(zoo)

df0 <- read.table(text = paste(df$v1, df$v2), na.strings = c("NA", "Out", "Fail"))
na.approx(df0, method = "constant", rule = 2)

giving:

     V1 V2
[1,] 20 10
[2,] 20 10
[3,] 30 10
[4,] 30 10
[5,] 30 10
[6,] 32 34
[7,] 33 35
[8,] 10 30

If desired, we could express this using magrittr like this:

library(magrittr)
library(zoo)

df %$%
  paste(v1, v2) %>%
  read.table(text = ., na.strings = c("NA", "Out", "Fail")) %>%
  na.approx(method = "constant", rule = 2)
G. Grothendieck