0

For example, I have a dataset named "animals", read in with animals = read_csv("animals.csv").

It contains a column called "Species", whose values can be "a", "b", or "c". See below.

   Weight Height Species
    <dbl>  <dbl> <chr>  
 1   12.3   15.7 a      
 2    8.1   17.9 a      
 3   11.3   14.4 b      
 4   12.1   18   b      
 5    6.8   18.1 a      
 6   16.3   18.3 a      
 7   19.6   19.2 c      
 8   22.2   19   c     

Now I want to rename the values a, b, and c: a becomes "tiger", b becomes "lion", and c becomes "elephant", so the data looks like the output below. I need to use the factor function to rename them. How can I do this?

   Weight Height Species
    <dbl>  <dbl> <chr>  
 1   12.3   15.7 tiger      
 2    8.1   17.9 tiger      
 3   11.3   14.4 lion      
 4   12.1   18   lion      
 5    6.8   18.1 tiger      
 6   16.3   18.3 tiger      
 7   19.6   19.2 elephant      
 8   22.2   19   elephant
Phil
Subaru Spirit

3 Answers

3

Here is a base R option using factor, where df is the data frame from the question:

transform(
  df,
  Species = c('tiger', 'lion', 'elephant')[as.integer(factor(Species,levels = c('a', 'b', 'c')))]
)

which gives

  Weight Height  Species
1   12.3   15.7    tiger
2    8.1   17.9    tiger
3   11.3   14.4     lion
4   12.1   18.0     lion
5    6.8   18.1    tiger
6   16.3   18.3    tiger
7   19.6   19.2 elephant
8   22.2   19.0 elephant
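
The same renaming can probably be written more directly with the labels argument of factor(), which pairs each entry in levels with a replacement. The following is only a sketch: it rebuilds the question's data inline under the name animals (an assumption, since the original is read from animals.csv) and leaves Species as a factor rather than a character column.

```r
# Rebuild the question's data inline (assumption: the original comes from animals.csv)
animals <- data.frame(
  Weight  = c(12.3, 8.1, 11.3, 12.1, 6.8, 16.3, 19.6, 22.2),
  Height  = c(15.7, 17.9, 14.4, 18, 18.1, 18.3, 19.2, 19),
  Species = c("a", "a", "b", "b", "a", "a", "c", "c")
)

# levels lists the existing codes; labels gives the new name for each, in the same order
animals$Species <- factor(
  animals$Species,
  levels = c("a", "b", "c"),
  labels = c("tiger", "lion", "elephant")
)

levels(animals$Species)
as.character(animals$Species)
```

Wrap the call in as.character() if a character column is needed afterwards.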
ThomasIsCoding
2

Here is an option with left_join

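library(dplyr)

# Lookup table mapping each code to its full name
keydat <- data.frame(Species = c('a', 'b', 'c'),
                     value = c('tiger', 'lion', 'elephant'))

# Join on Species explicitly, then overwrite Species and drop the helper column
df1 <- df1 %>%
    left_join(keydat, by = "Species") %>%
    mutate(Species = value, value = NULL)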

Output

df1
#  Weight Height  Species
#1   12.3   15.7    tiger
#2    8.1   17.9    tiger
#3   11.3   14.4     lion
#4   12.1   18.0     lion
#5    6.8   18.1    tiger
#6   16.3   18.3    tiger
#7   19.6   19.2 elephant
#8   22.2   19.0 elephant

Or using a named vector to match and replace

nm1 <- setNames(c('tiger', 'lion', 'elephant'),
                c('a', 'b', 'c'))
# unname() drops the a/b/c names that the lookup would otherwise carry over
df1$Species <- unname(nm1[df1$Species])
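
Two details of the named-vector lookup are worth checking on a small case: a factor used as an index is treated as its integer codes rather than its labels, and a code with no entry in the lookup comes back as NA. A minimal sketch, where the vector species and its extra code "d" are hypothetical and not part of the question's data:

```r
nm1 <- c(a = "tiger", b = "lion", c = "elephant")

# Hypothetical input: a factor that also contains an unmapped code "d"
species <- factor(c("a", "b", "c", "d"))

# as.character() forces lookup by name; unname() drops the carried-over names
renamed <- unname(nm1[as.character(species)])
renamed

# Codes missing from nm1 show up as NA, which makes them easy to find
which(is.na(renamed))
```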

Or an option with fct_recode, applied to the original a/b/c values, in case the column is a factor (it should also work with a character column):

library(forcats)
df1$Species <- fct_recode(factor(df1$Species), !!!setNames(names(nm1), nm1))

Data

df1 <- structure(list(Weight = c(12.3, 8.1, 11.3, 12.1, 6.8, 16.3, 19.6, 
22.2), Height = c(15.7, 17.9, 14.4, 18, 18.1, 18.3, 19.2, 19), 
    Species = c("a", "a", "b", "b", "a", "a", "c", "c")), class = "data.frame",
    row.names = c("1", 
"2", "3", "4", "5", "6", "7", "8"))
akrun
1

If Species isn't already a factor, you can turn it into one with:

animals$Species = as.factor(animals$Species)

Then, you can change the levels by doing:

levels(animals$Species) = c("tiger","lion","elephant")
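
Note that assigning a plain character vector to levels() matches by position, so it relies on the existing levels being in the order a, b, c (as.factor sorts them). A named list of the form new = "old" avoids depending on that order. A small sketch, using a hypothetical vector species rather than the question's data:

```r
# Hypothetical values, deliberately not in alphabetical order
species <- c("c", "a", "b", "a")

f <- as.factor(species)
levels(f)                      # levels are sorted, not in order of appearance

# Positional assignment: the first new name goes to the first (sorted) level
levels(f) <- c("tiger", "lion", "elephant")
as.character(f)

# Named-list assignment: each new name is tied to its old code explicitly
g <- as.factor(species)
levels(g) <- list(tiger = "a", lion = "b", elephant = "c")
as.character(g)
```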
  • Thanks! This is what I wanted. One more question: I tested the code below, which works as well. Is there any difference between as.factor and factor? animals$Species = factor(animals$Species); levels(animals$Species) = c("tiger","lion","elephant") – Subaru Spirit Oct 21 '20 at 19:45
  • Apparently as.factor works best in some cases. If you're curious, check: https://stackoverflow.com/questions/39279238/why-use-as-factor-instead-of-just-factor#:~:text=%23%5B1%5D%20TRUE-,as.,see%20what%20it%20has%20done.&text=It%20first%20sort%20the%20unique,vector%20back%20to%20a%20factor. – Ricardo Semião e Castro Oct 21 '20 at 19:56
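
On the as.factor versus factor question from the comments: for a character column like Species the two should give the same result. One difference that is easy to see appears when the input is already a factor, where as.factor() returns it unchanged while factor() rebuilds it and drops unused levels. A minimal sketch with hypothetical values:

```r
f <- factor(c("a", "b", "c"))

# Subsetting keeps "c" as a level even though no element uses it any more
sub <- f[f != "c"]

levels(as.factor(sub))   # input is already a factor, so it is returned as-is
levels(factor(sub))      # factor() rebuilds the levels and drops the unused "c"
```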