ethnicity_col_names <- c("surname", "first_name", "surname.match", "white", "black",
                     "hispanic", "asian", "other")
colnames(ethnicity_sample) <- ethnicity_col_names
ethnicity_sample$try <- pmax(ethnicity_sample$white, ethnicity_sample$black, ethnicity_sample$hispanic,
            ethnicity_sample$asian, ethnicity_sample$other)

Each ethnicity column holds the % likelihood that the person belongs to that ethnicity. When I use the pmax function, it returns the highest % in each row as a number. Instead, I want it to return the name of the column (the ethnicity) with the highest % match.

user213544

1 Answer

We can use max.col to get, for each row, the index of the column holding the maximum value, and then use that index to look up the column name.

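# columns to compare; max.col() returns a position within this vector
nm1 <- c("white", "black", "hispanic", "asian", "other")
# ties.method = "first" picks the leftmost column when values tie
ethnicity_sample$try <- nm1[max.col(ethnicity_sample[nm1], ties.method = "first")]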
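To see the approach end to end, here is a minimal sketch on made-up data. The rows and values below are hypothetical and stand in for the real ethnicity_sample from the question; only the column names are taken from it. The last row contains a deliberate tie between white and black to show what ties.method = "first" does (the default for max.col is "random", which would break ties at random).

```r
# Hypothetical toy data; the real ethnicity_sample comes from the question.
ethnicity_sample <- data.frame(
  surname  = c("Smith", "Garcia", "Nguyen", "Lee"),
  white    = c(0.70, 0.10, 0.05, 0.40),
  black    = c(0.20, 0.05, 0.05, 0.40),
  hispanic = c(0.05, 0.80, 0.05, 0.10),
  asian    = c(0.03, 0.03, 0.80, 0.05),
  other    = c(0.02, 0.02, 0.05, 0.05)
)

nm1 <- c("white", "black", "hispanic", "asian", "other")

# max.col() gives, per row, the position of the largest value among nm1;
# indexing nm1 with those positions turns them into column names.
idx <- max.col(ethnicity_sample[nm1], ties.method = "first")
ethnicity_sample$try <- nm1[idx]

ethnicity_sample$try
# [1] "white"    "hispanic" "asian"    "white"
```

Subsetting with ethnicity_sample[nm1] before calling max.col matters: it keeps the non-numeric columns (surname, first_name, surname.match) out of the comparison and makes the returned positions line up with nm1.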
akrun