
I have a column in my dataframe where gender is coded 1 and 0 for male and female respectively. It's not a replica, but looks something like this:

df <- read.csv("df.csv")

   Gender  Age  Width
1       0   35    1.4
2       0   30    1.4
3       1   32    1.3
4       1   31    1.5
5       0   36    1.4
6       1   39    1.7


I've managed to change the column's class to factor and give it labels:

df$Gender <- as.factor(df$Gender)
class(df$Gender)

df$Gender <- factor(df$Gender,
levels = c("1","0"),
labels = c("male", "female"))

However, when I try to print df$Gender, every value in the output is "NA".

UPDATE: Thank you all for your help! I realised that my code works when I run it the first time. It only becomes "NA" when I rerun the second chunk. Will this be a problem or can I just ignore it?

hellooo

2 Answers


You can use

library(tidyverse)

df %>% 
  mutate(gender = factor(Gender, levels = c(1, 0), labels = c("male", "female")))

or simply

df$gender <- ifelse(df$Gender == 1,"male","female")

or

df %>% 
  mutate(gender = if_else(Gender == 1,"male","female"))

or

df %>% 
  mutate(gender = case_when(Gender == 1 ~ "male",
                            Gender == 0 ~ "female"))

Data

df = structure(list(Sn = 1:6, Gender = c(0L, 0L, 1L, 1L, 0L, 1L), 
    Age = c(35L, 30L, 32L, 31L, 36L, 39L), Width = c(1.4, 1.4, 
    1.3, 1.5, 1.4, 1.7), gender = c("female", "female", "male", 
    "male", "female", "male")), row.names = c(NA, -6L), class = "data.frame")
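
One thing worth checking with the factor() variant is which code ends up with which label. When `levels` is not supplied, factor() uses the sorted unique values (here 0, then 1) and attaches the labels in that order. The following is a minimal base-R sketch, assuming the coding from the question (1 = male, 0 = female); the vector `Gender` and the names `implicit` and `explicit` are only for illustration.

```r
# Same codes as the Gender column in the data above
Gender <- c(0L, 0L, 1L, 1L, 0L, 1L)

# Without `levels`, factor() uses the sorted unique values (0, 1),
# so the first label is attached to 0 and the second to 1
implicit <- factor(Gender, labels = c("female", "male"))

# With explicit `levels`, the labels follow the order you give
explicit <- factor(Gender, levels = c(1, 0), labels = c("male", "female"))

# Cross-tabulate the codes against the labels to confirm the mapping
print(table(code = Gender, label = explicit))

# Check whether both spellings give the same label for each row
print(identical(as.character(implicit), as.character(explicit)))
```

Spelling out `levels` is usually the safer choice, because the mapping then does not depend on how the codes happen to sort.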
UseR10085

Everything you have done seems correct when I reconstruct your input:

Gender <- c(1,0,1,0)
Age <- c(50,30,40,30)

df <- data.frame(Gender,Age)
df$Gender <- factor(df$Gender,
                levels = c("1","0"),
                labels = c("male", "female"))
print(df$Gender)

If you really want it as a character column, you can then add:

df$Gender <- as.character(df$Gender)

But I think (as others have already mentioned) it's because of your input data, so try adding stringsAsFactors = FALSE to your import command:

df <- read.csv("df.csv", stringsAsFactors = FALSE)
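
On the UPDATE in the question: the NA values on a rerun are most likely explained by the column being overwritten in place. After the first run the values are "male" and "female", which no longer match the levels "1" and "0", so factor() turns them all into NA. Below is a small sketch of that behaviour, assuming a stand-in vector `gender_raw` rather than the real CSV; writing the result to a new column (here `gender`, a name chosen for illustration) is one way to keep the chunk safe to rerun.

```r
# Hypothetical stand-in for the Gender column read from the CSV
gender_raw <- c(0, 0, 1, 1, 0, 1)

# First run: the values "0"/"1" match the levels, so the labels are applied
first_run <- factor(gender_raw, levels = c("1", "0"), labels = c("male", "female"))
print(first_run)

# Second run on the already-labelled factor: its values are now
# "male"/"female", which match neither "1" nor "0", so every entry becomes NA
second_run <- factor(first_run, levels = c("1", "0"), labels = c("male", "female"))
print(second_run)

# Writing to a new column leaves the original codes intact,
# so running the same line twice gives the same result
df <- data.frame(Gender = gender_raw)
df$gender <- factor(df$Gender, levels = c(1, 0), labels = c("male", "female"))
df$gender <- factor(df$Gender, levels = c(1, 0), labels = c("male", "female"))
print(df$gender)
```

So the NA output is not harmless if the chunk is rerun on the same object: the labels are lost until the data is read in again.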
dholzer