library(tidyverse)
df <- tibble(col1 = c(5, 2), col2 = c(6, 4), col3 = c(9, 9))
# # A tibble: 2 x 3
#    col1  col2  col3
#   <dbl> <dbl> <dbl>
# 1     5     6     9
# 2     2     4     9


df.cha <- df %>% mutate(col4 = ifelse(apply(.[, 1:3], 1, sd) > 3,
                                      "True",
                                      "False"))
df.cha$col4
#[1] "False" "True" 

The code above works fine. Column 4 is a character vector, as I'd expect. However, when I add one extra condition to my ifelse() call, namely & .[, 3] > 0, R suddenly creates a matrix for column 4 instead of leaving it as a character vector like I want. See below.

Why is this?

df.mat <- df %>% mutate(col4 = ifelse(apply(.[, 1:3], 1, sd) > 3 & 
                                        .[, 3] > 0,  # I only added this
                                      "True",
                                      "False"))
df.mat$col4
#      col3   
# [1,] "False"
# [2,] "True" 
Display name
  • Change the `.[,3]` to `.[[3]]`. It is similar to the logic for the earlier question you posted – akrun Apr 25 '19 at 16:25
  • Here, you are comparing a vector to a single-column tibble. Check `apply(df[1:3], 1, sd) > 3 & df[, 3] > 0`: the `apply` part returns a logical vector, and it is compared with a one-column `tibble`. The behavior is similar when you apply a function to a data.frame, e.g. `is.na(df1)` returns a matrix – akrun Apr 25 '19 at 16:28
  • But `df.cha$col4` is returned 'properly' as a character vector in my first example even though I use `.[, 1:3]` there, and `.[, 1:3]` is a tibble, not a vector. Why doesn't the first example `df.cha` create a col4 matrix? – Display name Apr 25 '19 at 16:30
  • But it goes through `apply`, which converts it to a `matrix` and drops the `tbl_df` class. In the second expression, the tibble is used as is, without any conversion – akrun Apr 25 '19 at 16:35

1 Answer

apply() converts your input to a matrix first, then when you run sd() across that matrix you get a simple vector.

But when you do .[,3] with a tibble, you get a tibble back. Selecting one column does not simplify to a vector like it does with data.frames. Note that you would get different behavior if df were a data.frame rather than tibble.
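
To make the difference in shapes concrete, here is a minimal sketch using the `df` from the question. The variable names (`row_sd`, `one_col_tbl`, and so on) are illustrative only and not part of the original code.

```r
library(tibble)

df <- tibble(col1 = c(5, 2), col2 = c(6, 4), col3 = c(9, 9))

# apply() coerces its input to a matrix, so the row-wise sd() results
# come back as a plain numeric vector with no dimensions.
row_sd <- apply(df[, 1:3], 1, sd)
is.null(dim(row_sd))              # TRUE

# Single-column `[` on a tibble keeps the tibble shape (2 rows x 1 column).
one_col_tbl <- df[, 3]
inherits(one_col_tbl, "tbl_df")   # TRUE

# `[[` always extracts the column itself as a vector.
one_col_vec <- df[[3]]

# On a base data.frame, `[, 3]` drops to a vector by default.
one_col_df <- as.data.frame(df)[, 3]
is.null(dim(one_col_df))          # TRUE
```

This is why the first example in the question behaves: everything that reaches ifelse() there has already been flattened by apply().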

So the "problem" isn't really with ifelse(). You are doing comparisons on a tibble, so the shape is preserved rather than simplified to a vector, and a greater-than/less-than comparison on a tibble returns a matrix by design. ifelse() then simply returns a result with the same shape as its test.
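
The following sketch, again using the `df` from the question, traces where the matrix comes from and shows one possible fix, the `[[` extraction suggested in the comments. The intermediate names (`cmp_tbl`, `test_mat`, `out_mat`, `test_vec`, `out_vec`) are made up for illustration.

```r
library(tibble)

df <- tibble(col1 = c(5, 2), col2 = c(6, 4), col3 = c(9, 9))

row_sd <- apply(df[, 1:3], 1, sd)   # plain numeric vector

# Comparing a one-column tibble yields a 2 x 1 logical matrix.
cmp_tbl <- df[, 3] > 0
is.matrix(cmp_tbl)                  # TRUE

# Combining a vector with that matrix keeps the matrix shape,
# and ifelse() returns a result shaped like its test.
test_mat <- row_sd > 3 & cmp_tbl
out_mat  <- ifelse(test_mat, "True", "False")
is.matrix(out_mat)                  # TRUE

# Extracting the column with `[[` keeps everything a vector.
test_vec <- row_sd > 3 & df[[3]] > 0
out_vec  <- ifelse(test_vec, "True", "False")
out_vec                             # "False" "True"
```

Inside mutate(), referring to the column by its bare name (`col3 > 0`) should have the same effect as `.[[3]] > 0`, since dplyr exposes columns as vectors there.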

MrFlick