2

I know that it is possible to do the following:

df$V1[df$V1 == "Y"] <- 1

to change every value equal to "Y" to 1. However, what if I also have values equal to "N" that I want to change to 0?

I have tried doing this:

df$V1[df$V1 == c("Y","N")] <- c(1,0)

but I get a warning

longer object length is not a multiple of shorter object length

and not all of the values that match are converted.

What would be the right way to do this?

brucezepplin

3 Answers

4

Here is why your code didn't work

df$V1[df$V1 == c("Y","N")] <- c(1,0)

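compares V1 element by element against c("Y","N"), recycling the shorter vector: the first value is compared with "Y", the second with "N", the third with "Y" again, and so on. It does not ask whether each value is either "Y" or "N". To test for membership in a set, use %in%. The replacement still has to depend on the value being replaced, because assigning c(1,0) directly would just alternate 1 and 0 over the selected positions. If V1 is a character column, the new values are stored as the strings "1" and "0", which as.numeric() can convert afterwards.

idx <- df$V1 %in% c("Y", "N")                  # TRUE where V1 is "Y" or "N"
df$V1[idx] <- ifelse(df$V1[idx] == "Y", 1, 0)  # "Y" becomes 1, "N" becomes 0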
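
To see the difference between == and %in% concretely, here is a small sketch. The vector v is made-up example data, and eq and isin are just illustrative names.

```r
v <- c("Y", "N", "N", "Y", "N")  # made-up example data

# == recycles c("Y", "N") along v, so v[1] is compared with "Y",
# v[2] with "N", v[3] with "Y" again, and so on. Because 5 is not a
# multiple of 2, R also emits the warning quoted in the question.
eq <- v == c("Y", "N")

# %in% asks, for each element of v, whether it is one of the listed values.
isin <- v %in% c("Y", "N")

print(eq)
print(isin)
```

Here eq is TRUE only at the positions where the value happens to line up with the recycled pattern, while isin is TRUE for every element.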

In your case, I would consider using factors. A factor is R's type for categorical data, and its levels are the set of distinct values it can take. The function levels(x) returns the levels of the factor x.

So if you have a factor that looks like this:

x <- factor(c('Male', 'Male', 'Male', 'Female', 'Female', 'Female'), levels = c('Male', 'Female'))

you will see that it is made up of two repeated values, 'Male' and 'Female'. (The levels are given explicitly here; by default factor() sorts them alphabetically, which would put 'Female' first.)

If you print x, you will get

[1] Male   Male   Male   Female Female Female
Levels: Male Female

and after you run levels(x) <- c('M','F'), printing x gives

[1] M M M F F F
Levels: M F

For instance, given the following data frame:

V1 <- factor(rep(c('a', 'd'), times = c(10, 8)))  # first column consists of 10 'a' and 8 'd'
V2 <- 1:18
df <- data.frame(V1, V2)

levels(df$V1) <- c('A', 'D')  # replace all 'a' with 'A' and all 'd' with 'D'

I think this is the idiomatic way to do replacement.
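
Applied to the Y/N case from the question, the factor approach might look like the sketch below. The vector v is made-up example data, and the levels are passed explicitly so that the new labels line up with them. This is one possible way to do it, not the only one.

```r
v <- c("Y", "N", "N", "Y")            # made-up example data
f <- factor(v, levels = c("N", "Y"))  # fix the level order explicitly
levels(f) <- c("0", "1")              # "N" becomes "0", "Y" becomes "1"

# Go through as.character() first: as.numeric(f) on its own would
# return the internal level codes (1 and 2), not the labels.
num <- as.numeric(as.character(f))
print(num)
```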

Alternatively, if you only want to replace specific values, I suggest writing a function that works like a hash (a lookup table) and applying it over the data frame.

This technique is used with ggplot2 to replace labels in facet_wrap: http://www.cookbook-r.com/Graphs/Facets_(ggplot2)/

This means you will end up writing more lines of code, although I think the result reads more nicely.
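
A minimal sketch of the lookup idea, using a named vector as the "hash". The names lookup, v and recoded are illustrative and not from the original code. Values without an entry in the table come back as NA.

```r
lookup <- c(Y = 1, N = 0)           # old value (name) -> new value
v <- c("Y", "N", "N", "X", "Y")     # made-up example data; "X" has no entry

# Indexing a named vector by a character vector looks each value up by name.
recoded <- unname(lookup[v])
print(recoded)
```

For a data frame column the same lookup could be written as df$V1 <- unname(lookup[as.character(df$V1)]), where as.character() guards against V1 being a factor.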

biobirdman
  • ahh yes, the %in% function is key I see. That is so simple. I see what you mean about asking to match a vector. could you explain a bit more how that last line in the factors bit works? I understand that I have levels in my data which I want to change each one, but I don't see how the last line changes lower case 'a' to upper case 'A' – brucezepplin Feb 18 '15 at 16:00
  • @brucezepplin Done! Hope this will help you better understand – biobirdman Feb 19 '15 at 00:53
  • thanks so much. the levels() output makes sense in renaming now. thank you. – brucezepplin Feb 19 '15 at 09:09
1

chartr could be useful:

x <- c("Y","N","N","X")

chartr("YN", "10", x)
#[1] "1" "0" "0" "X"

Of course, this only works if all of your values are one-character strings, since chartr translates individual characters wherever they occur.
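
To illustrate that caveat, here is a small sketch with made-up values, two of which are longer than one character. The names y and out are illustrative.

```r
y <- c("Y", "N", "NO", "YES")  # made-up example data
out <- chartr("YN", "10", y)   # every "Y" becomes "1", every "N" becomes "0"
print(out)
#[1] "1"   "0"   "0O"  "1ES"
```

The result is still a character vector, so for clean one-character input as.numeric() would be needed afterwards to get numbers.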

Roland
0

Best to write reproducible code to begin with. One answer is:

df <- data.frame(V1 = c("Y", "Y", "N"))
# NA (rather than a string such as "X") as the fallback keeps the result numeric
df$V1 <- ifelse(df$V1 == "Y", 1, ifelse(df$V1 == "N", 0, NA))
puslet88