
I have a dataframe that looks something like this:

df <- data.frame('Home.Team'=c("Omaha South", "Millard North", "Elkhorn","Elkhorn"),                      
                 'Winner'=c("Omaha South", "Millard North", "Elkhorn","Elkhorn"),
                 'Won By'=c(8,22,4,30),
                 'Away Class'=c("TRUE", "FALSE", "TRUE", "FALSE"))

I'm trying to create a new column using a nested conditional if_else from dplyr. This has worked for me in the past, but it is now giving me an error. Below are the R code and the error:

df$'Pre Score' <- 
        if_else(df$`Away Class`=="FALSE", 
        if_else(df$Home.Team==df$Winner, .8 + (df$`Won By`/100) -1, -1.2 - (df$`Won By`/100) -1), 
        if_else(df$Home.Team==df$Winner, .8 + (df$`Won By`/100), -1.2 - (df$`Won By`/100)))

Error: true must be length 4 (length of condition) or one, not 0

I've read through multiple SO discussions related to this but haven't been able to translate them into a solution for my problem. It seems to have something to do with the "if true" portion of the code. Apparently that part is evaluated as having length 0, whereas I want it to have length 4, i.e. to work for all rows. I tried replacing if_else with case_when but didn't succeed there either.

Jeff Swanson
  • Not directly related, but why do you have true & false as characters and not boolean values? For debugging ifelse statements, especially ones nested like this, it would be a good idea to pick them apart one by one to see which ones do or don't run. Also, how did you try `case_when`? – camille Mar 01 '20 at 18:37

1 Answer


This should have been a comment, but it came out too confusing:

You're missing periods in your variable names - by default, data.frame replaces spaces with periods when you create a data frame with column names that contain spaces:

if_else(df$`Away.Class`=="FALSE", #Away.Class instead of `Away Class`
             if_else(df$Home.Team==df$Winner, .8 + (df$`Won.By`/100) -1, -1.2 - (df$`Won.By`/100) -1), # Won.By instead of `Won By`
             if_else(df$Home.Team==df$Winner, .8 + (df$`Won.By`/100), -1.2 - (df$`Won.By`/100))) # ditto
[1] 0.88 0.02 0.84 0.10
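If you would rather keep the spaces in the column names, one option is to pass check.names = FALSE to data.frame, so the backticked names from the question resolve as written. This is only a sketch of that alternative; df_raw is a hypothetical name used here for illustration.

```r
# Same data as in the question, but with name checking turned off
df_raw <- data.frame('Home.Team' = c("Omaha South", "Millard North", "Elkhorn", "Elkhorn"),
                     'Winner' = c("Omaha South", "Millard North", "Elkhorn", "Elkhorn"),
                     'Won By' = c(8, 22, 4, 30),
                     'Away Class' = c("TRUE", "FALSE", "TRUE", "FALSE"),
                     check.names = FALSE)

names(df_raw)           # "Home.Team" "Winner" "Won By" "Away Class"
df_raw$`Won By` / 100   # 0.08 0.22 0.04 0.30
```

With the names preserved, the original if_else call should no longer see a missing column.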

Here's why your code results in an error: df$`Won By` is NULL, because that column doesn't exist, so .8 + (df$`Won By`/100) -1 evaluates to a zero-length vector and the `true` argument ends up with length zero. if_else needs `true` and `false` to either be the same length as your condition (which is four, in which case each row gets its own value) or length 1 (in which case every matching row gets the same value).
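The length-zero behaviour can be checked in base R without dplyr. The snippet below is a minimal sketch; demo, missing_col, bad_true and good_true are hypothetical names used only for this illustration.

```r
# One-column data frame, created with the default check.names = TRUE
demo <- data.frame('Won By' = c(8, 22, 4, 30))

names(demo)                   # "Won.By"

# The backticked name with a space does not match any column
missing_col <- demo$`Won By`
is.null(missing_col)          # TRUE

# Arithmetic on NULL yields a zero-length numeric vector
bad_true <- .8 + (missing_col / 100) - 1
length(bad_true)              # 0

# Using the actual column name gives one value per row
good_true <- .8 + (demo$Won.By / 100) - 1
length(good_true)             # 4
```

Checking names(df) and the length of each branch like this is a quick way to debug nested if_else calls one piece at a time.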

iod
  • Thanks @iod. Will this solve the issue above or is that an additional error? – Jeff Swanson Mar 01 '20 at 17:57
  • As I show in my example - if I run the code with the missing periods, I get the expected answer (`[1] 0.88 0.02 0.84 0.10`) – iod Mar 01 '20 at 18:22