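Here is why your code didn't work:
df$V1[df$V1 == c("Y","N")] <- c(1,0)
The == operator compares element by element, and R recycles the shorter vector c("Y","N") along V1. So the first value of V1 is compared with "Y", the second with "N", the third with "Y" again, and so on. It does not test whether each value is "Y" or "N".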
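If you want to select the values that are either "Y" or "N", use %in%:
df$V1[df$V1 %in% c("Y", "N")] <- c(1,0)
Note that the replacement c(1,0) is still recycled by position, not matched by value, so on its own this does not turn every "Y" into 1 and every "N" into 0.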
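To make the difference concrete, here is a small sketch on a made-up vector (v and recoded are illustrative names, not from your data). It compares the two tests and then recodes by value with one assignment per value, which is just one simple way to do it:

```r
# Made-up example data
v <- c("Y", "N", "N", "Y", "N", "Y")

# Positional comparison: c("Y", "N") is recycled along v
pos_match <- v == c("Y", "N")
# TRUE TRUE FALSE FALSE FALSE FALSE

# Membership test: TRUE wherever the value is "Y" or "N"
in_match <- v %in% c("Y", "N")
# TRUE TRUE TRUE TRUE TRUE TRUE

# Recode by value, one assignment per value
recoded <- rep(NA_real_, length(v))
recoded[v == "Y"] <- 1
recoded[v == "N"] <- 0
recoded
# 1 0 0 1 0 1
```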
In your case, I would consider using factors. A factor is R's type for categorical data. Its levels are the set of distinct values the vector can take, and the function levels(x) returns the levels of a factor x.
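So if you have a factor that looks like this: x <- factor(c('Male', 'Male', 'Male', 'Female', 'Female', 'Female'), levels = c('Male', 'Female'))
you will see that it is made up of 2 repeated items, 'Male' and 'Female'.
If you print x
you will get
[1] Male   Male   Male   Female Female Female
Levels: Male Female
and when you run levels(x) <- c('M','F') and print x again,
you'll get
[1] M M M F F F
Levels: M F
The new names are assigned in level order, so check levels(x) first. Without the levels argument, factor() sorts the levels alphabetically ('Female' before 'Male'), and on a plain character vector levels(x) returns NULL.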
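One thing to watch for: levels() only works on a factor, and factor() orders the levels alphabetically unless told otherwise. As a rough sketch (orig_levels is just an illustrative name), renaming each level by its current name avoids depending on that order:

```r
# Default level order is alphabetical, not order of appearance
x <- factor(c("Male", "Male", "Male", "Female", "Female", "Female"))
orig_levels <- levels(x)
orig_levels
# "Female" "Male"

# Rename each level by its current name, so the order does not matter
levels(x)[levels(x) == "Male"] <- "M"
levels(x)[levels(x) == "Female"] <- "F"

as.character(x)
# "M" "M" "M" "F" "F" "F"
```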
For instance, given the following data frame:
V1 <- c(rep(letters[1], 10), rep(letters[4], 8)) ## first column consists of 10 'a' and 8 'd'
V2 <- 1:18
df <- data.frame(V1, V2, stringsAsFactors = TRUE) ## since R 4.0.0, strings are not converted to factors by default
levels(df$V1) <- c('A','D') # replace all 'a' with 'A' and all 'd' with 'D'
I think this is the idiomatic way to do the replacement.
Alternatively, if you only want to replace specific values, I suggest writing a function that works like a hash (a lookup table) and applying it over the data frame.
This technique is used with ggplot to replace labels in facet_wrap: http://www.cookbook-r.com/Graphs/Facets_(ggplot2)/
This means you will end up writing more lines of code, although I think the result will look nicer.
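A minimal sketch of the hash idea, assuming a named character vector as the lookup table. The helper recode_values, the lookup vector, and the sample data below are all made up for illustration; values not found in the table are left untouched:

```r
# Hypothetical helper: replace values found in `map`, keep everything else
recode_values <- function(x, map) {
  x <- as.character(x)
  hit <- x %in% names(map)          # which values have an entry in the table
  x[hit] <- unname(map[x[hit]])     # look them up by name
  x
}

# Made-up example data
df <- data.frame(V1 = c("Y", "N", "maybe", "Y"), V2 = 1:4,
                 stringsAsFactors = FALSE)
lookup <- c(Y = "1", N = "0")

# Apply the helper over the chosen columns only
cols <- "V1"
df[cols] <- lapply(df[cols], recode_values, map = lookup)

df$V1
# "1" "0" "maybe" "1"
```

The result is still character, so wrap it in as.numeric() if you need numbers and every value has been mapped.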