
I have a data set that includes gender. But instead of just "female" and "male", the column has four categories: "female", "f", "male" and "m". I'm trying to replace all the "f" and "m" values with "female" and "male" respectively.

What command do I use here?

The data look like this:

dt <- data.frame(...)

     Gender     Age    
1    female     24          
2         m     38      
3    female     29      
4         m     33      
5         m     49      
6         f     29      
7         f     26      
8         f     36      
9    female     58      
10        f     31      
11   female     31      
12        f     29      
13   female     19      
14     male     38      
15   female     63      

and I tried this code:

dt$Gender <- dt$Gender(c("female","female","male","male"))  

but it throws an error.

joran
    Welcome to SO. You should try to provide a [minimally reproducible example](http://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example/5963610#5963610) to make it easier for us to help you. – BrodieG Feb 04 '14 at 18:07

2 Answers


Since you mention factor in your title, did you look at the factor function?

x <- c("female", "f", "male", "m", "f", "undeclared")
y <- factor(x)
y
# [1] female     f          male       m          f          undeclared
# Levels: f female m male undeclared
levels(y) <- list("female" = c("female", "f"),
                  "male" = c("male", "m"),
                  "undeclared" = "undeclared")
y
# [1] female     female     male       male       female     undeclared
# Levels: female male undeclared
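
Applied to a data frame column, the same idea might look like the sketch below. The data frame `dt` here is a hypothetical stand-in built from a few rows of the question's data, and it assumes `Gender` is stored as a factor.

```r
# Hypothetical stand-in for the asker's data; only a few rows are reproduced
dt <- data.frame(
  Gender = c("female", "m", "female", "f", "male"),
  Age    = c(24, 38, 29, 29, 38),
  stringsAsFactors = TRUE   # make sure Gender is a factor
)

# Merge the duplicate levels by name, so the order of the old levels does not matter
levels(dt$Gender) <- list(female = c("female", "f"),
                          male   = c("male", "m"))

levels(dt$Gender)
# [1] "female" "male"
```

Because the list form matches old levels by name, it should keep working if more spellings are added later; any old level not listed becomes `NA`.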
A5C1D2H2I1M1N2O1R2T1
> #Use first 2 and last 4 cases in your data to demonstrate
> tt <- data.frame(Gender = as.factor(c("female", "m", "f", "female", "male", "female")), Age = as.numeric(c(24,38,29,19,38,63)))
> tt
  Gender Age
1 female  24
2      m  38
3      f  29
4 female  19
5   male  38
6 female  63

> str(tt)   # Check the structure of the data, make sure Gender is a factor
'data.frame':   6 obs. of  2 variables:
 $ Gender: Factor w/ 4 levels "f","female","m",..: 2 3 1 2 4 2
 $ Age   : num  24 38 29 19 38 63

> levels(tt$Gender)   # Show the levels of factor "Gender"
[1] "f"      "female" "m"      "male"  

> levels(tt$Gender) <- c("female","female","male","male")  
  # Assign new levels you want -- make sure they are in the same order as the old ones

> levels(tt$Gender)   # Now the identical levels are combined 
[1] "female" "male" 

Note: if Gender is not already a factor, you can convert it with as.factor() before reassigning the levels.
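
A minimal sketch of that case, assuming `Gender` was read in as a character column. The data frame `tt2` and the lookup vector `recode` are hypothetical names used only for illustration. Instead of relying on the position of each old level, this variant looks up the new label by the old level's name.

```r
# Hypothetical example: Gender stored as character rather than factor
tt2 <- data.frame(
  Gender = c("female", "m", "f", "male"),
  Age    = c(24, 38, 29, 38),
  stringsAsFactors = FALSE
)

# Convert the column to a factor first
tt2$Gender <- as.factor(tt2$Gender)

# Map each old level to its new label by name, not by position
recode <- c(f = "female", female = "female", m = "male", male = "male")
levels(tt2$Gender) <- unname(recode[levels(tt2$Gender)])

levels(tt2$Gender)
# [1] "female" "male"
```

The named lookup avoids having to check by eye that the replacement vector lines up with the order of the old levels.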

Min