Rename multiple field values in one statement

Question

I know that it is possible to do the following:

df$V1[df$V1 == "Y"] <- 1

to change every value equal to "Y" to 1. However, what if I also had values equal to "N" that I want to change to 0?

I have tried doing this:

df$V1[df$V1 == c("Y","N")] <- c(1,0)

but I get a warning

longer object is not a multiple of shorter object

and as a result not all of the values that match get converted.

What would be the way to do this?

(Or, if it's just "Y" and "N", just `as.numeric(df$V1 == "Y")`) — A5C1D2H2I1M1N2O1R2T1, Feb 18 '15 at 15:39
why don't you try `df$V1[df$V1 == "Y"] <- 1` in a first step and `df$V1[df$V1 == "N"] <- 0` in a second step? — rmuc8, Feb 18 '15 at 15:41
@Markus it's just a preference. I could do it like that of course, but if there is a way of doing it in one command I would much prefer that — brucezepplin, Feb 18 '15 at 15:42
Here's a quite awkward solution `x[x %in% c("Y","N")] <- (0:1)[as.numeric(factor(x[x %in% c("Y","N")]))]` (assuming `x` is `df$V1` and that `df$V1` isn't a factor) — David Arenburg, Feb 18 '15 at 16:01

biobirdman · Accepted Answer · 2015-02-19T00:53:18.033

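Here is why your code didn't work:

df$V1[df$V1 == c("Y","N")] <- c(1,0)

compares V1 with c("Y","N") element by element, recycling the shorter vector: the first value is compared with "Y", the second with "N", the third with "Y" again, and so on. That is why only some of the values match, and why you get the warning when the length of V1 is not a multiple of 2. To select values that are either "Y" or "N", use %in%. The replacement also has to be looked up by value, because a plain c(1,0) on the right-hand side would be recycled by position in the same way:

sel <- df$V1 %in% c("Y", "N")
df$V1[sel] <- c(Y = 1, N = 0)[as.character(df$V1[sel])]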
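To see the difference between the two comparisons, here is a small sketch on a made-up vector (the name `v` and its values are illustrative, not taken from the question):

```r
# Illustrative vector; not the asker's data
v <- c("Y", "N", "N", "Y", "N")

# `==` recycles c("Y", "N") along v, so v is compared with Y, N, Y, N, Y
# (R warns here because 5 is not a multiple of 2)
eq <- suppressWarnings(v == c("Y", "N"))
eq
# [1]  TRUE  TRUE FALSE FALSE FALSE

# %in% tests every element of v for membership in the set
inset <- v %in% c("Y", "N")
inset
# [1] TRUE TRUE TRUE TRUE TRUE
```

Only the first two positions happen to line up with the recycled pattern, which is the "not all values get converted" symptom from the question.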

In your case, I might consider using factors. Factors are R's type for categorical data. The levels of a factor are the set of distinct values it can take, and the function levels(x) returns the levels of a factor x.

So if you have a factor that looks like this: x <- factor(c('Male', 'Male','Male','Female','Female','Female'), levels = c('Male', 'Female'))

you can see that it is made up of two repeated items, 'Male' and 'Female'. (The levels are given explicitly because R would otherwise sort them alphabetically and put 'Female' first.)

If you print x

you will get

[1] Male   Male   Male   Female Female Female
Levels: Male Female

and when you run levels(x) <- c('M','F') and print x again

you'll get

[1] M M M F F F
Levels: M F

For instance, given the following data frame:

V1 <- rep(c('a', 'd'), times = c(10, 8)) ## first column consists of 10 'a' and 8 'd'
V2 <- 1:18
df <- data.frame(V1, V2, stringsAsFactors = TRUE) # make V1 a factor (not the default since R 4.0)

levels(df$V1) <- c('A','D') # replace all 'a' with 'A' and all 'd' with 'D'
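Applied to the Y/N case from the question, the same idea can be written with the `levels` and `labels` arguments of `factor()`. This is only a sketch, assuming a character column; the data frame `d` is made up for illustration:

```r
# Illustrative data; `d` is not the asker's data frame
d <- data.frame(V1 = c("Y", "N", "N", "Y"), stringsAsFactors = FALSE)

# Relabel "Y" as "1" and "N" as "0"; levels and labels pair up by position
f <- factor(d$V1, levels = c("Y", "N"), labels = c("1", "0"))

# A factor stores labels, so go through character to get real numbers
d$V1 <- as.numeric(as.character(f))
d$V1
# [1] 1 0 0 1
```

Any value that is not listed in `levels` becomes `NA` with this approach, so it fits best when the column contains only the values being recoded.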

I think this is the idiomatic way to do the replacement.

Another way, if you only want to replace specific values, is to write a function that works like a hash (a lookup table) and apply it over the data frame.

This technique is used in ggplot to replace labels in facet_wrap: http://www.cookbook-r.com/Graphs/Facets_(ggplot2)/

It means you will end up writing more lines of code, although I think it reads more nicely.
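One way such a hash-like function might look is a named vector used as the lookup table. The helper name `recode_values` and the sample data are hypothetical, chosen only for this sketch:

```r
# Hypothetical helper: replace values found in `map`, leave everything else alone
recode_values <- function(x, map) {
  x <- as.character(x)        # also handles factor input
  hit <- x %in% names(map)    # which elements have an entry in the table
  x[hit] <- map[x[hit]]       # look the new values up by name
  x
}

# Names are the old values, elements are the new ones
lookup <- c(Y = "1", N = "0")

x <- c("Y", "N", "N", "X")
out <- recode_values(x, lookup)
out
# [1] "1" "0" "0" "X"
```

For a single column this would be used as `df$V1 <- recode_values(df$V1, lookup)`. The result is a character vector, so wrap it in `as.numeric()` if every value is covered by the table and numbers are wanted.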

ahh yes, the %in% function is key I see. That is so simple. I see what you mean about asking to match a vector. could you explain a bit more how that last line in the factors bit works? I understand that I have levels in my data which I want to change each one, but I don't see how the last line changes lower case 'a' to upper case 'A' — brucezepplin, Feb 18 '15 at 16:00
@brucezepplin Done! Hope this will help you better understand — biobirdman, Feb 19 '15 at 00:53
thanks so much. the levels() output makes sense in renaming now. thank you. — brucezepplin, Feb 19 '15 at 09:09

score 1 · Answer 2 · answered Feb 18 '15 at 15:43

chartr could be useful:

x <- c("Y","N","N","X")

chartr("YN", "10", x)
#[1] "1" "0" "0" "X"

Of course this only works if you only have one-character strings.

Roland


score 0 · Answer 3 · answered Feb 18 '15 at 15:46

Best to write reproducible code to begin with. One answer is:

df <- data.frame(V1 = c("Y","Y","N"))
# the "X" fallback is a string, so the result is generally character ("1", "0", "X")
df$V1 <- ifelse(df$V1 == "Y", 1, ifelse(df$V1 == "N", 0, "X"))
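If a numeric column is wanted instead, one possible variation (an assumption about the intent, not part of the answer above) is to use `NA_real_` as the fallback, so that no string enters the result. The data frame `d` is made up for illustration:

```r
# Illustrative data with one value that is neither "Y" nor "N"
d <- data.frame(V1 = c("Y", "Y", "N", "X"), stringsAsFactors = FALSE)

# A numeric fallback keeps the whole result numeric
d$V1 <- ifelse(d$V1 == "Y", 1, ifelse(d$V1 == "N", 0, NA_real_))
d$V1
# [1]  1  1  0 NA
```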

puslet88
