I am curious why an ifelse() statement within a call to dplyr::mutate() only seems to apply to the first row of my data frame. It returns a single value, which is recycled down the entire column. Since the expressions evaluated in either case of the ifelse() are only valid in the context of my data frame, I would expect the condition check and the resulting expression evaluations to be performed on the columns as a whole, not just on their first elements.
Here's an example: I have a variable defined outside the data frame called checkVar. Depending on the value of checkVar, I want to add different values to my data frame in a new column, z, that are computed as a function of existing columns.
If I do
    library(dplyr)  # provides %>% as well as mutate()

    checkVar <- 1
    df <- data.frame( x=11:15, y=1:5 ) %>%
      dplyr::mutate( z=ifelse(checkVar == 1, x/y, x-y) )
    df
it returns
       x y  z
    1 11 1 11
    2 12 2 11
    3 13 3 11
    4 14 4 11
    5 15 5 11
Instead of z being the quotient of x and y for each row, all rows are populated with the quotient of x and y from the first row of the data frame.
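As a sanity check, here is a minimal base-R sketch with no dplyr involved (the names scalar_test and whole_vector are mine, for illustration only). If I read it right, it suggests the truncation comes from ifelse() itself, whose result takes the length of the test, rather than from mutate():

```r
# Same values as in the example above, but as plain vectors.
checkVar <- 1
x <- 11:15
y <- 1:5

# The test (checkVar == 1) has length 1, so ifelse() returns a length-1 result.
scalar_test <- ifelse(checkVar == 1, x / y, x - y)

# A plain if/else picks one branch and returns that whole vector.
whole_vector <- if (checkVar == 1) x / y else x - y

print(length(scalar_test))   # length of the ifelse() result
print(length(whole_vector))  # length of the if/else result
```

Inside mutate(), a length-1 result like scalar_test would then be recycled down the column, which matches the output above.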
However, if I specify rowwise(), I get the result I want:
    df <- df %>%
      dplyr::rowwise() %>%
      dplyr::mutate( z=ifelse(checkVar == 1, x/y, x-y) ) %>%
      dplyr::ungroup()
    df
returns
    # A tibble: 5 x 3
          x     y         z
      <int> <int>     <dbl>
    1    11     1 11.000000
    2    12     2  6.000000
    3    13     3  4.333333
    4    14     4  3.500000
    5    15     5  3.000000
Why do I have to explicitly specify rowwise() when x and y are only defined as columns of my data frame?