
I would like to use ifelse() inside a dplyr::mutate() call, but I do not necessarily know the column name. However, this column will always be the first one, so I know its position. Is there a way to do this?

Reprex using column name:

library(dplyr, warn.conflicts = FALSE)

tibble(x = 1:10, y = rnorm(10)) %>% 
  mutate(z = ifelse(x < 4, "a", "b"))
#> # A tibble: 10 x 3
#>        x       y z    
#>    <int>   <dbl> <chr>
#>  1     1  1.03   a    
#>  2     2 -0.600  a    
#>  3     3  0.0364 a    
#>  4     4  0.986  b    
#>  5     5 -0.815  b    
#>  6     6  0.166  b    
#>  7     7 -0.607  b    
#>  8     8 -0.719  b    
#>  9     9  0.799  b    
#> 10    10 -0.947  b

Created on 2020-03-30 by the reprex package (v0.3.0)

Now I need to do the same using the column position (1) instead, something like ifelse(**position 1** < 4, "a", "b").

This has to be inside a dplyr::mutate call.

FMM

2 Answers

dplyr >= 1.0

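Since 1.0, dplyr provides the cur_data() function, which is preferable to relying on the . from the pipe operator and respects grouping: it returns the data for the current group, without the grouping columns, so with grouped data [[1]] refers to the first non-grouping column. Note that cur_data() was deprecated in dplyr 1.1.0 in favour of pick().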

tibble(x = 1:10, y = rnorm(10)) %>% 
    mutate(z = ifelse(cur_data()[[1]] < 4, "a", "b"))
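
For dplyr 1.1.0 and later, where cur_data() is deprecated, the same idea can probably be written with pick(), which accepts tidyselect positions. This is a minimal sketch rather than the answer's original code; the names df and result are only illustrative.

```r
library(dplyr, warn.conflicts = FALSE)

# Illustrative data, same shape as in the question
df <- tibble(x = 1:10, y = rnorm(10))

# pick(1) returns a one-column tibble holding the first (non-grouping) column;
# [[1]] extracts it as a plain vector so ifelse() returns a character vector
result <- df %>%
  mutate(z = ifelse(pick(1)[[1]] < 4, "a", "b"))
```

As with cur_data(), pick() cannot select grouping columns, so on grouped data position 1 means the first non-grouping column.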

Original answer

Reference the column at index i using .[[i]], where . is the data frame passed in by the pipe operator.

tibble(x = 1:10, y = rnorm(10)) %>% 
  mutate(z = ifelse(.[[1]] < 4, "a", "b"))
#> # A tibble: 10 x 3
#>        x       y z    
#>    <int>   <dbl> <chr>
#>  1     1  0.255  a    
#>  2     2 -0.0805 a    
#>  3     3 -0.553  a    
#>  4     4 -0.492  b    
#>  5     5 -1.80   b    
#>  6     6  0.199  b    
#>  7     7 -0.397  b    
#>  8     8  1.06   b    
#>  9     9  1.72   b    
#> 10    10 -0.248  b
caldwellst

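You can reference a column by its index instead of its name, using double brackets directly after the data frame. Single brackets (df[1]) return a one-column data frame rather than a vector, so the comparison would produce a matrix.

E.g. ifelse(df[[1]] < 4, "a", "b")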
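
Inside a pipe the data frame often has no name to index, so another option is to look up the first column's name and pass it to the .data pronoun. A minimal sketch, assuming a hypothetical helper add_flag() (not part of dplyr):

```r
library(dplyr, warn.conflicts = FALSE)

# Hypothetical helper: flags rows based on whatever the first column is called
add_flag <- function(df) {
  first_col <- names(df)[1]  # resolve position 1 to a column name
  df %>%
    mutate(z = ifelse(.data[[first_col]] < 4, "a", "b"))
}

result <- add_flag(tibble(x = 1:10, y = rnorm(10)))
```

Resolving the name up front keeps the mutate() call independent of the pipe's . placeholder, so it should behave the same with the native |> pipe.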

mhovd