8

I have a data frame in which I recoded several columns so that 999 is set to NA:

dfB <- dfA %>%
  mutate(adhere = if_else(adhere==999, as.numeric(NA), adhere)) %>%
  mutate(engage = if_else(engage==999, as.numeric(NA), engage)) %>%
  mutate(quality = if_else(quality==999, as.numeric(NA), quality)) %>%
  mutate(undrstnd = if_else(undrstnd==999, as.numeric(NA), undrstnd)) %>%
  mutate(sesspart = if_else(sesspart==999, as.numeric(NA), sesspart)) %>%
  mutate(attended = if_else(attended>=9, as.integer(NA), attended))

I want to use mutate_at() over a range of columns, with recode() instead of if_else(), but I am stuck on how to give it the condition. Based on some mutate_all() examples, I think it should be something like 999 = NA, but the NA also needs to match the type of .x, and I am unsure how to make it type-sensitive.

I tried:

y <- data.frame(y1=c(1,2,999,3,4), y2=c(1L, 2L, 999L, 3L, 4L), y3=c(T,T,F,F,T))
z <- y %>%
    mutate_at( vars(y1:y2), funs(recode(.,`999` = as.numeric(NA))))

But I get the warning "Unreplaced values treated as NA as .x is not compatible. Please specify replacements exhaustively or supply .default", and I can see that it worked for the numeric column y1 but not for the integer column y2:

> z
  y1 y2    y3
1  1 NA  TRUE
2  2 NA  TRUE
3 NA NA FALSE
4  3 NA FALSE
5  4 NA  TRUE
D. Bontempo

5 Answers

9

I think it is related to the column type. I added mutate_if() to convert all integer columns to numeric, and then set the recode value to NA_real_. It seems to work.

library(dplyr)

y <- data.frame(y1=c(1,2,999,3,4), y2=c(1L, 2L, 999L, 3L, 4L), y3=c(T,T,F,F,T))

z <- y %>%
  mutate_if(is.integer, as.numeric) %>%
  mutate_at(vars(y1:y2), funs(recode(.,`999` = NA_real_)))
z
#   y1 y2    y3
# 1  1  1  TRUE
# 2  2  2  TRUE
# 3 NA NA FALSE
# 4  3  3 FALSE
# 5  4  4  TRUE
www
  • Thanks, www. This does solve the problem of the warning. It forces everything to real and avoids the wrong kind of NA for the previously integer columns. I had considered this, but other parts of my code count on those columns being integer, so I would need to reset them to integer after recoding. I was hoping for a way to make the NA value responsive to the kind of number in each column. – D. Bontempo Nov 28 '17 at 04:32
7

I'm having trouble understanding exactly what you want to accomplish, so let me know if this isn't quite it.


library(dplyr)

y <- data.frame(y1=c(1,2,999,3,4), y2=c(1L, 2L, 999L, 3L, 4L), y3=c(T,T,F,F,T))

y

#>    y1  y2    y3
#> 1   1   1  TRUE
#> 2   2   2  TRUE
#> 3 999 999 FALSE
#> 4   3   3 FALSE
#> 5   4   4  TRUE

z <- y %>%
  mutate_at(vars(y1:y2), ~ifelse(. == 999, NA, .))

z

#>   y1 y2    y3
#> 1  1  1  TRUE
#> 2  2  2  TRUE
#> 3 NA NA FALSE
#> 4  3  3 FALSE
#> 5  4  4  TRUE
ardaar
  • thanks evertr. This does solve the problem. It uses an if-else approach instead of recode(), but I can live with that. I can use the "." as you suggest to avoid changing to numeric. I am not clear why it does not complain that the NA for true is the wrong type. In my original code I had to use as.numeric(NA) or as.integer(NA) to avoid errors. Do you know why it does not give an error here? – D. Bontempo Nov 28 '17 at 04:38
  • ahh, OK. I see that you used ifelse() which does not check the type the same way that if_else() does. Do you know how this could be done with if_else() without casting the whole data frame as real? – D. Bontempo Nov 28 '17 at 04:43
  • @D.Bontempo you could use `mutate_if(is.numeric, ...)`, which also matches integers, so that you don't have to select all the variables (like in the solution from @www, but without converting anything). @everetr I'd recommend removing that `as.numeric` from your solution, because there is no need for type conversion. Then it would be +1-worthy ;-) – Tino Nov 28 '17 at 05:50
  • @Tino, See my comment below the code. I already said one can omit `as.numeric`, if desired. I didn't know if @D. Bontempo wanted conversion or not. – ardaar Nov 28 '17 at 14:20
  • @D.Bontempo See my comment below the code. You can omit `as.numeric` to avoid converting `y$y2` from `integer` to `numeric`. – ardaar Nov 28 '17 at 15:51
  • @everetr but there is no need to convert to numeric at all, or am I missing something? I'd therefore omit `as.numeric` altogether... – Tino Nov 28 '17 at 17:15
  • @Tino Omitted. Ellipsis averted. – ardaar Nov 28 '17 at 20:39
  • @everetr - thanks, I will not change to numeric. The key bit here really was that ifelse() does not police types and if_else() does. I solved the problem by adopting ifelse(). I think if I had to use if_else(), I would write a function to return an NA value that matched. – D. Bontempo Nov 29 '17 at 16:58
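The last comment's idea, a function that returns an NA matching the column's type so that if_else() can still be used, could look something like the sketch below. The helper name na_like() is hypothetical (it is not part of dplyr); it relies on the base R behaviour that indexing a vector with NA_integer_ yields a length-1 NA of that vector's own type. Recent dplyr versions may also accept a bare NA in if_else(), so treat this as an illustration rather than the required approach.

```r
library(dplyr)

# Hypothetical helper (not part of dplyr): a length-1 NA of the same type as x.
na_like <- function(x) x[NA_integer_]

y <- data.frame(y1 = c(1, 2, 999, 3, 4),
                y2 = c(1L, 2L, 999L, 3L, 4L),
                y3 = c(TRUE, TRUE, FALSE, FALSE, TRUE))

# if_else() sees an NA of the matching type in each column,
# so y1 stays double and y2 stays integer.
z <- y %>%
  mutate(across(y1:y2, ~ if_else(.x == 999, na_like(.x), .x)))

str(z)
```

Because the NA is derived from the column itself, the same lambda can be applied to a mixed range of integer and double columns without casting anything first.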
7

Now that funs() has been deprecated in dplyr, here's the new way to go:

z <- y %>%
  mutate_if(is.integer, as.numeric) %>%
  mutate_at(vars(y1:y2), list(~recode(.,`999` = NA_real_)))

Replace funs with list and insert a ~ before recode.
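If the integer columns need to stay integer (the concern raised in the question), one possible variation is to skip the conversion step and recode each type with its own NA constant. This is only a sketch under the assumption that y is the example data frame from the question; it uses across() with where(), which requires dplyr 1.0.0 or later.

```r
library(dplyr)

y <- data.frame(y1 = c(1, 2, 999, 3, 4),
                y2 = c(1L, 2L, 999L, 3L, 4L),
                y3 = c(TRUE, TRUE, FALSE, FALSE, TRUE))

# Recode each numeric type with an NA of the same type, so recode() can
# fall back to the original values for everything that is not 999.
z <- y %>%
  mutate(
    across(where(is.integer), ~ recode(.x, `999` = NA_integer_)),
    across(where(is.double),  ~ recode(.x, `999` = NA_real_))
  )

str(z)
```

The logical column y3 is matched by neither selector and is left untouched.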

bcarothers
7

Currently, based on dplyr documentation:

across() supersedes the family of "scoped variants" like summarise_at(), summarise_if(), and summarise_all().

So, using mutate and across instead is now recommended.

Like Chris LeBoa said, if you only want to convert an annoying value to NA, the function na_if() is probably the best choice:

library(dplyr)

y <- data.frame(y1=c(1,2,999,3,4), y2=c(1L, 2L, 999L, 3L, 4L), y3=c(T,T,F,F,T))

y
   y1  y2    y3
1   1   1  TRUE
2   2   2  TRUE
3 999 999 FALSE
4   3   3 FALSE
5   4   4  TRUE
 
z <- y %>%
    mutate(across(
        y1:y2,
        ~na_if(., 999)
    ))

z
  y1 y2    y3
1  1  1  TRUE
2  2  2  TRUE
3 NA NA FALSE
4  3  3 FALSE
5  4  4  TRUE

Similarly, if you really want to recode values in multiple columns, you can follow the example from bcarothers:

df1 <- tibble(Q7_1=1:5,
              Q7_1_TEXT=c("let's","see","grogu","this","week"),
              Q8_1=6:10,
              Q8_1_TEXT=rep("grogu",5),
              Q8_2=11:15,
              Q8_2_TEXT=c("grogu","is","the","absolute","best"))

df2 <- df1 %>%
    mutate(across(
        starts_with("Q8") & ends_with("TEXT"),
        ~recode(., "grogu"="mando")
    ))
teppo
1

If you are trying to recode something to NA, the na_if() function should also work.
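A minimal sketch of what that looks like on plain vectors, using made-up example values that mirror the question's data. It shows that na_if() keeps the input's type, so integer columns stay integer:

```r
library(dplyr)

x_int <- c(1L, 2L, 999L, 3L, 4L)
x_dbl <- c(1, 2, 999, 3, 4)

r_int <- na_if(x_int, 999L)  # 999 becomes NA, result is still integer
r_dbl <- na_if(x_dbl, 999)   # 999 becomes NA, result is still double

typeof(r_int)
#> [1] "integer"
typeof(r_dbl)
#> [1] "double"
```

Note that na_if() only matches values equal to its second argument, so a range condition such as the question's attended >= 9 would still need if_else() or a similar conditional.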