I am receiving the following error message when running the code below: Error in mutate_impl(.data, dots) : Evaluation error: argument "no" is missing, with no default.

 mutate(x,perfLev= ifelse(SS< 1438, "Below Basic",
                   ifelse(SS>= 1439 & SS <= 1499, "Basic",
                   ifelse(SS >= 1500 & SS <= 1545, "Proficient",
                   ifelse(SS >= 1546, "Advanced")))))
Obsidian Age
Omer

3 Answers

Using comments by Mako212 and Renu, here's one option for fixing it:

library(dplyr)
mutate(x,
       perfLev = case_when(
         SS <  1438              ~ "Below Basic",
         SS >= 1439 & SS <= 1499 ~ "Basic",
         SS >= 1500 & SS <= 1545 ~ "Proficient",
         SS >= 1546              ~ "Advanced",
         TRUE                    ~ "huh?"
       ) )

I added a "default" (TRUE), which is generally good practice because it makes the fallback explicit. If you omit the TRUE line, unmatched rows get NA instead, in case that's what you want. Rows go unmatched here if any of the following are true:

  • is.na(SS)
  • SS >= 1438 & SS < 1439
  • SS > 1499 & SS < 1500
  • SS > 1545 & SS < 1546

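You may not need it if NA is acceptable. Note that even if SS is guaranteed to be an integer, SS == 1438 still matches none of the conditions, because the first condition uses < rather than <=.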
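To see which values fall through to the default, here is a small sketch. The data frame and its probe values are made up for illustration; they are not from the original question.

```r
library(dplyr)

# Hypothetical scores chosen to hit each boundary and each gap
x <- data.frame(SS = c(1437, 1438, 1439, 1499, 1499.5, 1500, 1545, 1546, NA))

res <- mutate(x,
  perfLev = case_when(
    SS <  1438              ~ "Below Basic",
    SS >= 1439 & SS <= 1499 ~ "Basic",
    SS >= 1500 & SS <= 1545 ~ "Proficient",
    SS >= 1546              ~ "Advanced",
    TRUE                    ~ "huh?"   # catches gaps and NA
  ))

# Rows that matched none of the four explicit conditions
res[res$perfLev == "huh?", ]
```

With these values, the rows for 1438, 1499.5 and NA end up in the default bucket.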

This code is equivalent to a slight fix to your code (with one difference: ifelse returns NA when SS is NA, whereas the case_when above returns "huh?"):

mutate(x,
       perfLev = 
         ifelse(SS < 1438, "Below Basic",
                ifelse(SS >= 1439 & SS <= 1499, "Basic",
                       ifelse(SS >= 1500 & SS <= 1545, "Proficient",
                              ifelse(SS >= 1546, "Advanced", "huh?"))))
       )

Indentation for style/clarity only.
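As for why the original code failed: ifelse() takes three arguments (test, yes, no), and no has no default. The innermost ifelse in the question supplied only two. A minimal base R sketch of the same failure, with made-up inputs; the missing argument is only evaluated when at least one element fails the test, which is why the error surfaces at run time:

```r
# The innermost call in the question had no third argument.
# Capture the error message instead of stopping the script.
msg <- tryCatch(
  ifelse(c(TRUE, FALSE), "Advanced"),
  error = function(e) conditionMessage(e)
)
msg

# Supplying the third argument fixes it
fixed <- ifelse(c(TRUE, FALSE), "Advanced", "huh?")
fixed
```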

r2evans
  • (I really wish R supported python's chaining of comparisons, such as `1499 < SS < 1500` ... I understand why it won't happen, but still ...) – r2evans Mar 15 '18 at 20:59
  • If you are using `case_when`, then you can also use `between` rather than `a < b < c` – MrFlick Mar 15 '18 at 21:07
  • @MrFlick, except that they are both-closed, meaning `between` gives you `a <= b <= c`. In his examples, you are right, I should have included that, but I often need open-ends, too. – r2evans Mar 15 '18 at 21:14
  • I just ran a quick test with 1e5 random `SS`, and `dplyr::case_when` was ~2x faster than `dplyr::if_else`, and it was ~5x faster than `base::ifelse`. In this example. With four nested conditionals. I can find performance-parity when `n` is around 200, so if your data is smaller then performance is likely not a consideration; if your data is larger, then you will be faster with either of `dplyr`'s functions. – r2evans Mar 15 '18 at 21:21

case_when is used to vectorize multiple if/else statements:

library(dplyr)
mutate(x,perfLev= case_when(
                   SS < 1438 ~ "Below Basic",
                   SS >= 1439 & SS <= 1499 ~ "Basic",
                   SS >= 1500 & SS <= 1545 ~ "Proficient",
                   SS >= 1546 ~ "Advanced"))
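
Because case_when uses the first condition that matches, the two-sided ranges can be replaced by ordered upper bounds, which leaves no gaps between bands. This is a variation, not the code above: it assumes that scores of 1438 should count as "Below Basic" and that fractional scores belong to the band below the next threshold. The sample values are invented.

```r
library(dplyr)

# Hypothetical scores, including values that fall in the gaps of the
# two-sided version
x <- data.frame(SS = c(1438, 1499.5, 1545.5, 1546, NA))

res <- mutate(x,
  perfLev = case_when(
    SS <  1439 ~ "Below Basic",   # checked first
    SS <  1500 ~ "Basic",         # only reached when SS >= 1439
    SS <  1546 ~ "Proficient",    # only reached when SS >= 1500
    SS >= 1546 ~ "Advanced"
  ))

res$perfLev
```

Rows where SS is NA match nothing and stay NA, since no default is given.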
Mako212
  • Poll: any reason you prefer `require` over `library`? (I've found only one use-case where I find its behavior desirable, otherwise I agree with Hadley's discussion [here](http://r-pkgs.had.co.nz/namespace.html), search down for *"There are four functions ..."*.) – r2evans Mar 15 '18 at 20:53
  • @r2evans I remember referencing [this](https://stackoverflow.com/questions/5595512/what-is-the-difference-between-require-and-library) post and deciding it doesn't really matter unless you're writing packaged functions. The main difference is that `library` throws an error if a package isn't found, whereas `require` returns `TRUE/FALSE`. I've just defaulted to `require` for a long time, and I've never had any issues, but I can see the argument for `library` since it throws an explicit error. – Mako212 Mar 15 '18 at 21:00
  • I'm familiar with that post and the argument, and when I was learning R, I started using `require`. I promptly stopped when I jumped to another computer, sourced a `quux.R` file, and it skimmed past that line (which silently returned `FALSE` without stopping) and failed later. I had to bring in the file line-by-line, which is where I found the mistake. Another common discussion/debate is Yihui's [library-vs-require](https://yihui.name/en/2014/07/library-vs-require/). Thanks, this was just an unofficial poll :-) – r2evans Mar 15 '18 at 21:03
  • @r2evans thanks for the reading material, I might just consider switching to `library` – Mako212 Mar 15 '18 at 21:08
  • Thanks so much for your time ! – Omer Apr 09 '18 at 16:40
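
The `library` vs `require` difference discussed in these comments can be checked with a short base R sketch. The package name is deliberately made up and is assumed not to be installed.

```r
# Hypothetical package name, assumed not to exist on this machine
pkg <- "notARealPackage123"

# require() warns and returns FALSE, so the script keeps running
ok <- suppressWarnings(require(pkg, character.only = TRUE, quietly = TRUE))
ok

# library() stops with an error; capture its message for inspection
msg <- tryCatch(
  library(pkg, character.only = TRUE),
  error = function(e) conditionMessage(e)
)
msg
```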

Although the OP asked about a problem with ifelse inside mutate, I think it is worth mentioning that cut is a better option in this scenario.

One can simply write:

library(dplyr)
x %>% 
mutate(perfLev = cut(SS, breaks = c(0, 1438, 1499, 1545, +Inf), 
    labels = c("Below Basic", "Basic", "Proficient", "Advanced"))) 

#OR

x$perfLev <- cut(x$SS, breaks = c(0, 1438, 1499, 1545, +Inf), 
    labels = c("Below Basic", "Basic", "Proficient", "Advanced"))

Matching breaks to labels is easier if you write them down in a tabular format and use that as a guide. For the case above, it could look like this:

0
1438    --    "Below Basic"
1499    --    "Basic"
1545    --    "Proficient"
+Inf    --    "Advanced"
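
One thing to keep in mind is that cut's intervals are open on the left and closed on the right by default (right = TRUE, include.lowest = FALSE). A small base R sketch with invented boundary values shows what that means for the breaks above; in particular 1438 lands in "Below Basic" and a score of exactly 0 becomes NA unless include.lowest = TRUE is set.

```r
# Hypothetical scores sitting on each break
SS <- c(0, 1, 1438, 1439, 1499, 1500, 1545, 1546)
labs <- c("Below Basic", "Basic", "Proficient", "Advanced")
brks <- c(0, 1438, 1499, 1545, Inf)

# Default: intervals are (0,1438], (1438,1499], (1499,1545], (1545,Inf]
lev <- as.character(cut(SS, breaks = brks, labels = labs))
lev

# include.lowest = TRUE closes the first interval on the left: [0,1438]
lev_incl <- as.character(cut(SS, breaks = brks, labels = labs,
                             include.lowest = TRUE))
lev_incl
```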
MKR