
I'm trying to run a nested ifelse statement in R. Here's a look at the structure of my data using the glimpse() function from the tidyverse:

Rows: 22,104
Columns: 9
$ `Formation/Locality`    <chr> "Montmartre", "Montmartre", "Montmartre", "Fur", "Me...
$ Location                <chr> "Ile-de-France Region, France", "Ile-de-France Regio...
$ Environment             <chr> "terrestrial", "terrestrial", "terrestrial", "offsho...
$ `Palaeolongitude(N/-S)` <dbl> 47.4, 47.4, 47.4, 52.3, 46.9, 42.9, 47.5, 46.9, 46.2...
$ `Palaeolatitude(E/-W)`  <dbl> 1.6, 1.6, 1.6, 5.4, 4.8, 1.9, -5.2, 4.8, -93.6, -111...
$ TaxonomicLevel          <chr> "Order", "Order", "Order", "Order", "Order", "Order"...
$ TaxonomicName           <chr> "Upupiformes", "Upupiformes", "Upupiformes", "Trogon...
$ MinMax                  <chr> "MaxMa", "MaxMa", "MaxMa", "MaxMa", "MaxMa", "MaxMa"...
$ Age                     <dbl> 37.2, 37.2, 37.2, 55.8, 48.6, 37.2, 48.6, 48.6, 55.8...

I'm trying to get R to look at the Age column and, if the value is within a certain range, put the geological age name into a new column called AgeName. If the value is not within that range, I want it to move on to the next age range, and so on. Here's my code so far:

pbdb_tidyish$AgeName <- ifelse(56>=pbdb_tidyish$Age&&47.8<pbdb_tidyish$Age,
                               "Ypresian",
                               ifelse(47.8>=pbdb_tidyish$Age&&41.2<pbdb_tidyish$Age,
                                      "Lutetian",
                                      ifelse(41.2>=pbdb_tidyish$Age&&37.8<pbdb_tidyish$Age,
                                             "Bartonian",
                                             ifelse(37.8>=pbdb_tidyish$Age&&33.9<=pbdb_tidyish$Age,
                                                    "Priabonian",NA))))

When I run this code, it creates the new column but fills the whole column with "Priabonian" so the dataset now looks like this:

Rows: 22,104
Columns: 10
$ `Formation/Locality`    <chr> "Montmartre", "Montmartre", "Montmartre", "Fur", "Me...
$ Location                <chr> "Ile-de-France Region, France", "Ile-de-France Regio...
$ Environment             <chr> "terrestrial", "terrestrial", "terrestrial", "offsho...
$ `Palaeolongitude(N/-S)` <dbl> 47.4, 47.4, 47.4, 52.3, 46.9, 42.9, 47.5, 46.9, 46.2...
$ `Palaeolatitude(E/-W)`  <dbl> 1.6, 1.6, 1.6, 5.4, 4.8, 1.9, -5.2, 4.8, -93.6, -111...
$ TaxonomicLevel          <chr> "Order", "Order", "Order", "Order", "Order", "Order"...
$ TaxonomicName           <chr> "Upupiformes", "Upupiformes", "Upupiformes", "Trogon...
$ MinMax                  <chr> "MaxMa", "MaxMa", "MaxMa", "MaxMa", "MaxMa", "MaxMa"...
$ Age                     <dbl> 37.2, 37.2, 37.2, 55.8, 48.6, 37.2, 48.6, 48.6, 55.8...
$ AgeName                 <chr> "Priabonian", "Priabonian", "Priabonian", "Priabonia...

Does anyone have any idea where I'm going wrong? I think it's only looking at the first Age value, running the ifelse statement on that, and then filling the whole column with the result instead of moving on to the next row.

Thanks,

Carolina

mnist

3 Answers

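Without the data it is unclear whether this is the only error, but you should not use && here, since it is not vectorized. It checks only the value in the first row and returns a single TRUE or FALSE based on that one observation; ifelse then returns a single value, which is recycled to fill the whole column. (Since R 4.3.0, && with an operand of length greater than one is an error rather than a silent first-element check.)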

Use & instead.

For a comparison see this answer
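
To make the difference concrete, here is a minimal sketch on a made-up five-element vector (the `age` values and the names `test_vec` and `age_name` are illustrative and not taken from the question's data). It runs the nested ifelse from the question with `&&` swapped for the element-wise `&`:

```r
# Hypothetical sample ages (Ma): one per stage, plus one out-of-range value
age <- c(37.2, 55.8, 45.0, 40.0, 60.0)

# `&` compares element by element, so the test holds one TRUE/FALSE per row
test_vec <- 56 >= age & 47.8 < age
print(test_vec)

# The nested ifelse from the question, with `&&` replaced by `&`
age_name <- ifelse(56 >= age & 47.8 < age, "Ypresian",
            ifelse(47.8 >= age & 41.2 < age, "Lutetian",
            ifelse(41.2 >= age & 37.8 < age, "Bartonian",
            ifelse(37.8 >= age & 33.9 <= age, "Priabonian", NA))))
print(age_name)
```

Each element is now classified on its own, and the value outside every range (60.0) comes back as NA.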

mnist

I think whenever you find yourself writing nested ifelse statements you should stop and ask yourself whether there might be a better way to achieve what you're trying to do. For example, the following single function call does what you are trying to achieve, and is easier to understand and maintain:

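# Use the Age column; include.lowest = TRUE keeps Age == 33.9 in "Priabonian"
pbdb_tidyish$AgeName <- cut(pbdb_tidyish$Age, breaks = c(33.9, 37.8, 41.2, 47.8, 56),
    labels = c("Priabonian", "Bartonian", "Lutetian", "Ypresian"), include.lowest = TRUE)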
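
One detail worth checking with cut() is how it treats the interval boundaries. The following is a small sketch on made-up ages (the vectors `age`, `stage_breaks` and `stage_labels` are illustrative names, not from the question). By default the intervals are open on the left and closed on the right, so a value equal to the lowest break is left unclassified unless `include.lowest = TRUE` is set:

```r
# Hypothetical sample ages (Ma), including the lowest break and an out-of-range value
age <- c(33.9, 37.2, 40.0, 45.0, 55.8, 60.0)
stage_breaks <- c(33.9, 37.8, 41.2, 47.8, 56)
stage_labels <- c("Priabonian", "Bartonian", "Lutetian", "Ypresian")

# Default (right = TRUE): intervals are (lower, upper], so 33.9 itself gets NA
default_cut <- cut(age, breaks = stage_breaks, labels = stage_labels)

# include.lowest = TRUE closes the first interval: [33.9, 37.8]
closed_cut <- cut(age, breaks = stage_breaks, labels = stage_labels,
                  include.lowest = TRUE)

# cut() returns a factor; convert it if a character column is wanted
age_name <- as.character(closed_cut)
print(age_name)
```

With `include.lowest = TRUE` the boundaries match the ranges in the question's ifelse version, where 33.9 counts as Priabonian.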

Allan Cameron

Since you're already using the tidyverse, you should make yourself familiar with case_when:

library(dplyr)

# Conditions are checked in order; rows matching none of them get NA
pbdb_tidyish <- pbdb_tidyish %>%
  mutate(AgeName = case_when(
    Age >= 33.9 & Age <= 37.8 ~ 'Priabonian',
    Age > 37.8 & Age <= 41.2 ~ 'Bartonian',
    Age > 41.2 & Age <= 47.8 ~ 'Lutetian',
    Age > 47.8 & Age <= 56 ~ 'Ypresian'
  ))
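
As a quick check of the boundary behaviour, here is a sketch that runs the same conditions on a toy data frame (the `toy` data frame and its values are made up for illustration; it assumes dplyr is installed):

```r
library(dplyr)

# Hypothetical ages sitting on or next to the range boundaries
toy <- data.frame(Age = c(33.9, 37.8, 37.9, 47.8, 56, 60))

toy <- toy %>%
  mutate(AgeName = case_when(
    Age >= 33.9 & Age <= 37.8 ~ "Priabonian",
    Age > 37.8 & Age <= 41.2 ~ "Bartonian",
    Age > 41.2 & Age <= 47.8 ~ "Lutetian",
    Age > 47.8 & Age <= 56 ~ "Ypresian"
  ))
print(toy)
```

Values on an upper boundary (37.8, 47.8, 56) stay in the younger-bounded stage listed for them, and 60, which matches no condition, ends up as NA.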
mmyoung77