I know this is a fairly basic question, but it still confuses me. The "problems" are the ~ and the . in R. They pop up everywhere and I don't know exactly what they mean in each context. For example, in this code I want to recode every -1 or -9 as NA:

df_clean = dplyr::mutate_all(df, ~ifelse(. %in% c(-1, -9), NA, .))

Here df is a data.frame with several columns, some of which contain many NAs. But why is there a ~ in front of the ifelse? And does the first . help to iterate over each row?

Sorry for the confusion, but maybe someone can explain this in simple terms ;)

Robin Kohrs

1 Answer

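The . here refers to the values in the current column, whereas ~ is formula-style shorthand for writing a function: dplyr turns ~ifelse(...) into a function whose argument is available as . inside the formula. Nothing iterates over rows. mutate_all calls that function once per column, and ifelse is vectorized, so it handles the whole column at once. It is more a style of coding than anything else.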

The same thing can be written with an ordinary anonymous function, as in base R:

dplyr::mutate_all(df, function(x) ifelse(x %in% c(-1, -9), NA, x))
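
To check that the formula and the anonymous function really do the same thing, here is a minimal sketch. The toy data frame is made up for illustration (it is not the asker's data), and it assumes dplyr and rlang are installed. rlang::as_function() is used only to show the kind of function a ~ formula is turned into.

```r
library(dplyr)

# Hypothetical toy data, not from the question
df <- data.frame(a = c(1, -1, 3), b = c(-9, 5, -1))

# Formula shorthand: `.` stands for the current column
via_formula <- mutate_all(df, ~ ifelse(. %in% c(-1, -9), NA, .))

# Ordinary anonymous function: `x` stands for the current column
via_function <- mutate_all(df, function(x) ifelse(x %in% c(-1, -9), NA, x))

# Both calls should give the same data frame
same_result <- isTRUE(all.equal(via_formula, via_function))
print(same_result)

# Converting the formula by hand gives a plain function of one argument
f <- rlang::as_function(~ ifelse(. %in% c(-1, -9), NA, .))
recoded <- f(c(1, -1, -9))
print(recoded)
```

Note that mutate_all is superseded in dplyr 1.0 and later. The same formula should also work inside mutate(df, across(everything(), ...)).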

which is close to using lapply in base R, except that lapply returns a plain list rather than a data frame:

lapply(df, function(x) ifelse(x %in% c(-1, -9), NA, x))
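
A minimal sketch of how the lapply version differs in its return type, using only base R. The toy data frame and the helper name recode_missing are made up for illustration. Assigning into df_clean[] is one common way to keep the data frame structure.

```r
# Hypothetical toy data, not from the question
df <- data.frame(a = c(1, -1, 3), b = c(-9, 5, -1))

# Hypothetical helper: recode -1 and -9 to NA for one column (vectorized)
recode_missing <- function(x) ifelse(x %in% c(-1, -9), NA, x)

# lapply returns a list with one element per column
as_list <- lapply(df, recode_missing)
print(is.data.frame(as_list))

# Assigning into df_clean[] replaces the columns but keeps the data frame
df_clean <- df
df_clean[] <- lapply(df, recode_missing)
print(is.data.frame(df_clean))
print(df_clean)
```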
Ronak Shah