
I'm trying to use a vector or a data frame to classify fish sizes as above or below a threshold size. If I use one standard threshold for all fish, say 30, I have no problems.

sp_data <- data.frame(
  X1 = c("fish1", "fish1", "fish2", "fish2", "fish3"),
  X2 = c(20, 30, 32, 21, 50)
)

sp_data$X3 <- ifelse(sp_data$X2 >= 30, "above", "below")

The above code works as intended. However, what I really want is to use a unique threshold for each fish. So I created this data frame, where I can list each fish and its corresponding size threshold.

size_data <- data.frame(
  S1 = c("fish1", "fish2", "fish3"),
  S2 = c(25, 26, 30)
)

sp_data$X4 <- ifelse(sp_data$X1 == size_data$S1 &
                       size_data$S2 >= sp_data$X2, "above", "below")

This doesn't work, I think because it's comparing all of sp_data$X1 against all of size_data$S1 instead of matching row by row. Perhaps ifelse is not the best way to solve this problem, but it's the closest I've found so far. I'm thinking I need a loop or an apply to make this work, but I'm not sure where to go from here.
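To illustrate that suspicion, here is a minimal sketch of what `==` does with two vectors of different lengths. The names `x1`, `s1`, and `same` are only for this illustration. The comparison is made position by position, and the shorter vector is recycled, so most rows get compared against the wrong fish:

```r
x1 <- c("fish1", "fish1", "fish2", "fish2", "fish3")  # like sp_data$X1
s1 <- c("fish1", "fish2", "fish3")                    # like size_data$S1

# s1 is recycled to c("fish1", "fish2", "fish3", "fish1", "fish2"),
# then compared element by element. R normally warns about the
# mismatched lengths; the warning is suppressed here for brevity.
same <- suppressWarnings(x1 == s1)
same
```

Only the first position happens to line up, which is why the result is not a per-fish lookup.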

MyNameisTK

2 Answers


You could join the datasets together and then use the calculation. Here's the original data:

sp_data <- data.frame(
  X1 = c("fish1","fish1","fish2","fish2","fish3"),
  X2 = c(20,30,32,21,50)
)

Note that you need to rename S1 to X1 in the size_data object so that the variable you're merging on has the same name in both datasets:

size_data <- data.frame(
  X1 = c("fish1", "fish2", "fish3"),
  S2 = c(25, 26, 30)
)

Then you can merge them with left_join() from dplyr:

library(dplyr)
# Join on the shared key column; naming it explicitly avoids the "Joining with" message
sp_data <- left_join(sp_data, size_data, by = "X1")

Finally, you can make the calculation you want:

# Refer to the columns through sp_data; bare X2 and S2 are not visible to ifelse()
sp_data$X3 <- ifelse(sp_data$X2 > sp_data$S2, "above", "below")
> sp_data
     X1 X2 S2    X3
1 fish1 20 25 below
2 fish1 30 25 above
3 fish2 32 26 above
4 fish2 21 26 below
5 fish3 50 30 above
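If you would rather not load dplyr, the same lookup can be sketched in base R with `match()`. This is an alternative to the join rather than part of it, and it assumes size_data keeps the question's original column names (S1, S2). The helper names `idx` and `threshold` are just for illustration:

```r
sp_data <- data.frame(
  X1 = c("fish1", "fish1", "fish2", "fish2", "fish3"),
  X2 = c(20, 30, 32, 21, 50)
)
size_data <- data.frame(
  S1 = c("fish1", "fish2", "fish3"),
  S2 = c(25, 26, 30)
)

# For each row of sp_data, find the row of size_data with the same fish
idx <- match(sp_data$X1, size_data$S1)

# Pull out the per-row threshold (NA if a fish has no entry in size_data)
threshold <- size_data$S2[idx]

# Same comparison as in the question, now against a per-row threshold
sp_data$X4 <- ifelse(sp_data$X2 >= threshold, "above", "below")
sp_data$X4
```

Because `threshold` has one value per row of sp_data, `ifelse()` compares like with like and no recycling is involved.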
DaveArmstrong

You can also do this in a single dplyr pipeline, joining on the differently named key columns without renaming anything:

library(dplyr)
sp_data %>%
  inner_join(size_data, by = c("X1" = "S1")) %>%
  mutate(X4 = case_when(X2 >= S2 ~ "above",
                        TRUE ~ "below")) %>%
  select(-S2)
     X1 X2    X4
1 fish1 20 below
2 fish1 30 above
3 fish2 32 above
4 fish2 21 below
5 fish3 50 above
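One thing to watch with `inner_join()`: rows whose fish has no entry in size_data are dropped from the result, whereas `left_join()` keeps them with an NA threshold. A minimal base R check you could run before joining is sketched below. The `fish4` entry and the variable names are hypothetical, added only to show the idea:

```r
# Fish names as they might appear in the measurements; fish4 is hypothetical
sp_names <- c("fish1", "fish1", "fish2", "fish2", "fish3", "fish4")
# Fish names that have a threshold defined
size_names <- c("fish1", "fish2", "fish3")

# Fish that were measured but have no threshold to compare against
missing_threshold <- setdiff(unique(sp_names), size_names)
missing_threshold
```

If this comes back empty, both join types give the same rows.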
Ben Norris