I am trying to loop through the values of a categorical variable and assign a number based on whether each value is "yes" or "no".

My data is "train" and the variable is "default", which looks as follows:

default = c("no", "yes", "no"....)

I want to create a separate vector which contains the number 10 for any value of "yes" and the number 1 for any value of "no".

I tried:

wgts = c()
for (y in 1:nrow(train)) {
  ifelse(train$default[y] == "yes", wgts = append(wgts[y], 10), wgts = append(wgts[y], 1))
  return(wgts)
}

But the resulting vector is turning out to be NULL. How can I fix this?

Jane Miller
  • Don't use `ifelse` when doing a single comparison; use `if` (and `else`). (1) Don't do assignment within it. (2) Declarative programming: if you intend a single comparison, use `if`. (3) `if` is primitive and much faster. (4) `ifelse` has baggage: https://stackoverflow.com/q/6668963/3358272. – r2evans Mar 30 '21 at 15:06
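
A minimal sketch of the loop rewritten along the lines of this comment, assuming `train` is a data frame whose `default` column holds only "yes" and "no" with no missing values (the sample data below is made up for illustration). In the original attempt, `wgts = ...` inside the `ifelse()` call is read as a named argument rather than an assignment, and `return()` is only meaningful inside a function, so `wgts` is never updated.

```r
# Hypothetical sample data standing in for the asker's `train`
train <- data.frame(default = c("no", "yes", "no"), stringsAsFactors = FALSE)

# Preallocate the result instead of growing it with append()
wgts <- numeric(nrow(train))

for (y in seq_len(nrow(train))) {
  # `if`/`else` tests one value at a time; the assignment happens in the branches
  if (train$default[y] == "yes") {
    wgts[y] <- 10
  } else {
    wgts[y] <- 1
  }
}

wgts
```

Using `seq_len(nrow(train))` rather than `1:nrow(train)` keeps the loop from running when the data frame has zero rows.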

2 Answers

There is no need for a for loop here. `ifelse` is vectorized, so just use:

ifelse(default == "yes", 10, 1)

This assumes your vector contains only "yes" and "no": any other non-missing value is also mapped to 1. A short sample:

default <- c("yes", "no")

ifelse(default == "yes", 10, 1)
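
Applied to the data from the question, the same call might look like the sketch below. It assumes `train` is a data frame with a `default` column; the three-row data frame and the `wgts` column name are made up for illustration.

```r
# Hypothetical stand-in for the asker's `train` data frame
train <- data.frame(default = c("no", "yes", "no"), stringsAsFactors = FALSE)

# Vectorized: one call handles every row, and the result is stored as a new column
train$wgts <- ifelse(train$default == "yes", 10, 1)

train$wgts
```

If a standalone vector is preferred, the result can be assigned to `wgts` instead of `train$wgts`.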

If you need some more speed, you can simply subset the vector:

default[default == "yes"] <- 10
default[default == "no"] <- 1
default <- as.numeric(default)

Note that this overwrites your default vector.
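
If the original values need to be kept, one possible variant of the same subsetting idea fills a separate vector instead. This is a sketch that assumes `default` contains only "yes" and "no"; the `wgts` name is borrowed from the question.

```r
default <- c("no", "yes", "no")

# Start with 1 everywhere, then overwrite the positions where default is "yes"
wgts <- rep(1, length(default))
wgts[default == "yes"] <- 10

wgts
```

Here `default` is left untouched and `wgts` is numeric from the start, so no `as.numeric()` conversion is needed.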

RBeginner

An option with case_when:

library(dplyr)
case_when(default == 'yes' ~ 10, TRUE ~ 1)
akrun