I have a dataframe of data for 400,000 trees of 6 different species. Each species is identified by a numeric species code. I would like to add another column listing the scientific name of each tree. The species codes are not consecutive, as this data was filtered down from 490,000 trees of 163 species based on abundance. Here is an example of data similar to what I have:

Index    Age    Species_code
0        45     14
1        47     32
2        14     62
3        78     126
4        40     14
5        38     17 
6        28     47

And here is an example of what I would like to get to:

Index    Age    Species_code    Species
0        45     14              Licania_heteromorpha
1        47     32              Pouteria_reticulata
2        14     62              Chrysophyllum_cuneifolium
3        78     126             Eperua_falcata
4        40     14              Licania_heteromorpha
5        38     17              Simaba_cedron
6        28     47              Sterculia_pruriens

I have been trying things along the lines of

if (Species_code == 14)
{
}

However, this gives me TRUE or FALSE in the output.

sethparker
  • You almost certainly want to merge/join your data. See this question for how to do that: https://stackoverflow.com/questions/1299871/how-to-join-merge-data-frames-inner-outer-left-right. Just make sure you also have a table that has a row for every species code and species name. – MrFlick Jan 17 '20 at 17:27
  • Yeah, do you have another table which is a unique list of species codes and species names? – SharpSharpLes Jan 17 '20 at 17:28
  • I do not have a table of just the species names and corresponding codes. – sethparker Jan 17 '20 at 18:55

3 Answers

One solution is to use mutate() with case_when(), provided you know which codes correspond to which species. I have filled in a few of them below; the remaining codes follow the same pattern:

library(tidyverse)
x <-"
  Index    Age    Species_code
0        45     14
1        47     32
2        14     62
3        78     126
4        40     14
5        38     17 
6        28     47"
y <- read.table(text = x, header = TRUE)
y <- y %>% 
  mutate(species = case_when(Species_code == 14 ~ "Licania_heteromorpha",
                             Species_code == 32 ~ "Pouteria_reticulata",
                             Species_code == 62 ~"Chrysophyllum_cuneifolium"))   #etc...
y
#   Index Age Species_code                   species
# 1     0  45           14      Licania_heteromorpha
# 2     1  47           32       Pouteria_reticulata
# 3     2  14           62 Chrysophyllum_cuneifolium
# 4     3  78          126                      <NA>
# 5     4  40           14      Licania_heteromorpha
# 6     5  38           17                      <NA>
# 7     6  28           47                      <NA>

Although if you have a separate dataset of species and codes, it would make more sense to merge.
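As a minimal sketch of the merge approach: since there are only 6 species, the lookup table can be typed by hand. The names `trees` and `species_lookup` are made up for illustration, and the codes and names are taken from the example in the question. This uses base R only.

```r
# Example data from the question
trees <- data.frame(
  Index = 0:6,
  Age = c(45, 47, 14, 78, 40, 38, 28),
  Species_code = c(14, 32, 62, 126, 14, 17, 47)
)

# Hypothetical hand-typed lookup table: one row per species code
species_lookup <- data.frame(
  Species_code = c(14, 17, 32, 47, 62, 126),
  Species = c("Licania_heteromorpha", "Simaba_cedron", "Pouteria_reticulata",
              "Sterculia_pruriens", "Chrysophyllum_cuneifolium", "Eperua_falcata"),
  stringsAsFactors = FALSE
)

# Left join: keep every tree and attach the matching name
result <- merge(trees, species_lookup, by = "Species_code", all.x = TRUE)

# merge() sorts by the key by default, so restore the original row and column order
result <- result[order(result$Index), c("Index", "Age", "Species_code", "Species")]

# Alternative that never reorders rows: look up each code's position with match()
trees$Species <- species_lookup$Species[match(trees$Species_code,
                                              species_lookup$Species_code)]
```

Any code missing from the lookup table would end up as NA in the Species column, which makes gaps easy to spot.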

user63230

You may want to use the ifelse() function.
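For instance, a rough sketch with nested ifelse() calls might look like this. Unlike if, which expects a single TRUE or FALSE, ifelse() works element by element on the whole column. The data frame `df` here is just a small stand-in built from the question's example, and only three codes are filled in.

```r
# Stand-in for the real data, using the codes from the question
df <- data.frame(Species_code = c(14, 32, 62, 126, 14, 17, 47))

# Each ifelse() handles one code and passes everything else to the next one;
# codes that match none of the conditions end up as NA
df$Species <- ifelse(df$Species_code == 14, "Licania_heteromorpha",
              ifelse(df$Species_code == 32, "Pouteria_reticulata",
              ifelse(df$Species_code == 62, "Chrysophyllum_cuneifolium",
                     NA_character_)))
```

This gets hard to read as the nesting grows, so it is probably only reasonable for a handful of codes.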

You may also want to use:

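# Character vector indexed by species code; positions never assigned stay NA
my_names <- character()
my_names[14] <- "Licania_heteromorpha"
my_names[62] <- "Chrysophyllum_cuneifolium"
...
# Use each row's code as a position in my_names
df$Species <- my_names[df$Species_code]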
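A variant of the same lookup idea, sketched with a named character vector so that no long, mostly empty vector is needed. The name `species_names` is made up for illustration, and `df` is a small stand-in built from the question's example.

```r
# Stand-in for the real data, using the codes from the question
df <- data.frame(Species_code = c(14, 32, 62, 126, 14, 17, 47))

# Names are the species codes (as strings), values are the scientific names
species_names <- c(
  "14"  = "Licania_heteromorpha",
  "17"  = "Simaba_cedron",
  "32"  = "Pouteria_reticulata",
  "47"  = "Sterculia_pruriens",
  "62"  = "Chrysophyllum_cuneifolium",
  "126" = "Eperua_falcata"
)

# Index by name rather than by position; unname() drops the codes from the result
df$Species <- unname(species_names[as.character(df$Species_code)])
```

Codes that are not in `species_names` would come back as NA.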

You may also want to look at dplyr's many functions for this, such as case_when and recode. See: https://dplyr.tidyverse.org/reference.

Arthur

As your problem has only 6 species, you can do this:

df$Species = NULL

df$Species[df$Species_code == 14] = 'Licania_heteromorpha'
df$Species[df$Species_code == 32] = 'Pouteria_reticulata'
.....

Filipe Lauar