Hi all, I'm new to R and would love your help. I have a data frame in which I would like to recode some values. Here's an example data frame:

x <- data.frame(age = sample(100, size = 6),
                gender = c("boy", "girl"))
print(x)
  age gender
1  58    boy
2  41   girl
3  31    boy
4  96   girl
5  93    boy
6  60   girl

Let's say I want to recode boy to man and girl to woman in a new column called new.gender. I tried using the ifelse function (to no avail):

x$new.gender <- NA
ifelse(x$gender == "boy", x$new.gender <- "man", x$new.gender <- "woman")
print(x)
  age gender new.gender
1  96    boy      woman
2  46   girl      woman
3  68    boy      woman
4   6   girl      woman
5  26    boy      woman
6  55   girl      woman

After some thinking, I changed the syntax a bit and got it to work:

x$new.gender <- NA
x$new.gender <- ifelse(x$gender == "boy", "man", "woman")
print(x)
  age gender new.gender
1  96    boy        man
2  46   girl      woman
3  68    boy        man
4   6   girl      woman
5  26    boy        man
6  55   girl      woman

Can someone help me understand why my first attempt resulted in all values changing to woman, while my second attempt worked? Thanks!

thelatemail
jason.f
  • Likely because `x$new.gender <- "woman"` is the last expression that got evaluated. – Rich Scriven Sep 06 '18 at 00:15
  • Related discussion of how `<-` works inside calls, and how it differs from `=`: https://stackoverflow.com/questions/1741820/what-are-the-differences-between-and-in-r – thelatemail Sep 06 '18 at 00:32

1 Answer

ifelse(test, yes, no) returns a new vector of the same length as test; it does not modify anything in place, so its result has to be assigned to be kept.
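
As a minimal sketch of that behaviour (the fixed ages and the name `recoded` are assumptions made here for reproducibility, not part of the question), the result of ifelse can be inspected before it is stored in the data frame:

```r
# Hypothetical example data mirroring the question's data frame;
# the ages are fixed here instead of sampled so the run is reproducible.
x <- data.frame(age = c(58, 41, 31, 96, 93, 60),
                gender = c("boy", "girl"))

# ifelse() builds a new vector with one element per element of the test.
recoded <- ifelse(x$gender == "boy", "man", "woman")
length(recoded) == nrow(x)

# The data frame only changes once that vector is assigned to a column.
x$new.gender <- recoded
print(x)
```

This is the same pattern as the second attempt in the question, just split into two steps.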

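In your case, the two assignments were passed as the yes and no arguments. R evaluates each argument expression at most once, in the calling environment, so x$new.gender <- "man" ran once and filled the whole column with "man", and then x$new.gender <- "woman" ran once and overwrote the whole column with "woman". The no argument is evaluated after yes, which explains why you see a queue of women in that column. The vector that ifelse itself returned was never assigned to anything.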
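
A rough way to check how often the assignments in the first attempt run is to count them. In this sketch the counters `yes_runs` and `no_runs` and the four-row data frame are hypothetical names and data introduced only for illustration:

```r
# Hypothetical four-row data frame, smaller than the one in the question.
x <- data.frame(gender = c("boy", "girl", "boy", "girl"))
x$new.gender <- NA

# Counters that record how many times each argument expression runs.
yes_runs <- 0
no_runs <- 0

# Same shape as the first attempt: assignments passed as arguments.
result <- ifelse(
  x$gender == "boy",
  { yes_runs <- yes_runs + 1; x$new.gender <- "man" },
  { no_runs <- no_runs + 1; x$new.gender <- "woman" }
)

yes_runs      # how many times the "yes" expression was evaluated
no_runs       # how many times the "no" expression was evaluated
x$new.gender  # the column after both whole-column assignments
result        # the vector ifelse() returned
```

Each counter should end at 1, the column should hold only "woman", and `result` should hold the correctly recoded values, which the first attempt simply never stored.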

TC Zhang