
I have a column containing random names. I would like to write code that creates another column (using the `mutate` function) by checking whether each name contains "Mr." and, if it does, putting "Male" in the new column.

zx8754
  • Try `df %>% mutate(newCol = ifelse(grepl("Mr\\.", othercol), "Male", othercol))` – akrun Jan 04 '18 at 09:09
  • Possible duplicate: https://stackoverflow.com/questions/19747384/how-to-create-new-column-in-dataframe-based-on-partial-string-matching-other-col – zx8754 Jan 04 '18 at 09:19
  • This is a basic question, please read [some manuals](https://stackoverflow.com/tags/r/info) – zx8754 Jan 04 '18 at 09:20
  • [Similar question](https://stackoverflow.com/questions/39903376/if-column-contains-string-then-enter-value-for-that-row), but without `dplyr::mutate`. – pogibas Jan 04 '18 at 09:28

1 Answer


Using dplyr and stringr:

library(stringr)
library(dplyr)

df <- data.frame(name = c("Mr. Robinson", "Mrs. robinson", "Gandalf","asdMr.dfa"))

df <- df %>% mutate(male = ifelse(str_detect(name, fixed("Mr.")), TRUE, FALSE))

Output:

> df
           name  male
1  Mr. Robinson  TRUE
2 Mrs. robinson FALSE
3       Gandalf FALSE
4     asdMr.dfa  TRUE

Be aware that this matches the phrase "Mr." anywhere in the string, not just at the beginning. If you don't want that, I'd use a regular expression anchored to the start of the string:

df <- df %>% mutate(male = ifelse(str_detect(name, "^Mr\\."), TRUE, FALSE))
> df
           name  male
1  Mr. Robinson  TRUE
2 Mrs. robinson FALSE
3       Gandalf FALSE
4     asdMr.dfa FALSE
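
To see why the dot is escaped in the pattern above, here is a small base R sketch (the `names_vec` vector is made up for illustration and is not from the question). In a regular expression an unescaped `.` matches any character, so `"Mr."` would also match `"Mrs"`; escaping the dot or passing `fixed = TRUE` restricts the match to a literal period.

```r
# Hypothetical example data, not from the original question
names_vec <- c("Mr. Robinson", "Mrs Robinson", "Gandalf", "asdMr.dfa")

# Unescaped dot: "." matches any character, so "Mrs" also matches
unescaped <- grepl("Mr.", names_vec)

# Escaped dot: only a literal "Mr." matches, anywhere in the string
escaped <- grepl("Mr\\.", names_vec)

# fixed = TRUE treats the whole pattern as a literal string (no regex)
literal <- grepl("Mr.", names_vec, fixed = TRUE)

# Anchored with ^: a literal "Mr." only at the start of the string
anchored <- grepl("^Mr\\.", names_vec)
```

Only the anchored version rejects both "Mrs Robinson" and "asdMr.dfa".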

This can also be done without the stringr package (inspired by @akrun):

df <- df %>% mutate(male = ifelse(grepl("^Mr\\.", name), TRUE, FALSE))

EDIT:

@docendo discimus pointed out that `ifelse()` isn't necessary, since we're creating a logical column and that's exactly what `grepl()` returns. So:

df <- df %>% mutate(male = grepl("^Mr\\.", name))

Without dplyr:

df <- transform(df, male = grepl("^Mr\\.", name))
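
Since the question asked for the label "Male" rather than a logical value, here is a minimal base R sketch of that variant. It assumes that names not starting with "Mr." should get `NA`, which the question does not specify; the column name `gender` is also made up for illustration.

```r
df <- data.frame(name = c("Mr. Robinson", "Mrs. robinson", "Gandalf", "asdMr.dfa"),
                 stringsAsFactors = FALSE)

# "Male" where the name starts with "Mr.", NA otherwise (assumed fallback)
df$gender <- ifelse(grepl("^Mr\\.", df$name), "Male", NA_character_)
```

Here `ifelse()` is needed because the result is a character label instead of TRUE/FALSE. With dplyr loaded, the same expression should work inside `mutate()`, e.g. `mutate(gender = ifelse(grepl("^Mr\\.", name), "Male", NA_character_))`.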
f.lechleitner
  • If you just return a logical true/false you don't need the `ifelse` since that's what `grepl` returns anyway. For that matter, you don't need dplyr either. It could simply be `transform(df, male = grepl("^Mr\\.", name))` – talat Jan 04 '18 at 09:27
  • hey cool, i didn't know that! makes perfect sense though, thanks – f.lechleitner Jan 04 '18 at 09:32