I have a column containing random names. I would like to write code that creates another column (using the mutate function) by checking whether each name contains "Mr." and, if it does, setting the new column to "Male".
-
Try `df %>% mutate(newCol = ifelse(grepl("Mr\\.", othercol), "Male", othercol))` – akrun Jan 04 '18 at 09:09
-
Possible duplicate: https://stackoverflow.com/questions/19747384/how-to-create-new-column-in-dataframe-based-on-partial-string-matching-other-col – zx8754 Jan 04 '18 at 09:19
-
This is a basic question, please read [some manuals](https://stackoverflow.com/tags/r/info) – zx8754 Jan 04 '18 at 09:20
-
[Similar question](https://stackoverflow.com/questions/39903376/if-column-contains-string-then-enter-value-for-that-row), but without `dplyr::mutate`. – pogibas Jan 04 '18 at 09:28
1 Answer
1
Using dplyr and stringr:
library(stringr)
library(dplyr)
df <- data.frame(name = c("Mr. Robinson", "Mrs. robinson", "Gandalf", "asdMr.dfa"))
# fixed() matches "Mr." as a literal string instead of as a regular expression
df <- df %>% mutate(male = ifelse(str_detect(name, fixed("Mr.")), TRUE, FALSE))
Output:
> df
           name  male
1  Mr. Robinson  TRUE
2 Mrs. robinson FALSE
3       Gandalf FALSE
4     asdMr.dfa  TRUE
Be aware that this matches the string "Mr." anywhere in the name, not just at the beginning. If you don't want that, I'd use a regular expression anchored to the start of the string:
df <- df %>% mutate(male = ifelse(str_detect(name, "^Mr\\."), TRUE, FALSE))
> df
           name  male
1  Mr. Robinson  TRUE
2 Mrs. robinson FALSE
3       Gandalf FALSE
4     asdMr.dfa FALSE
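As a side note on why the dot is escaped in the pattern: in a regular expression an unescaped `.` matches any character, so `"Mr."` would also match "Mrs". The sketch below compares the variants on the same sample names using only base R's `grepl`; the vector names (`unescaped`, `escaped`, `literal`, `anchored`) are made up for illustration.

```r
# Sample names taken from the example data above
x <- c("Mr. Robinson", "Mrs. robinson", "Gandalf", "asdMr.dfa")

# Unescaped dot: "." matches any character, so "Mrs" also matches "Mr."
unescaped <- grepl("Mr.", x)

# Escaped dot: only a literal "Mr." matches, anywhere in the string
escaped <- grepl("Mr\\.", x)

# fixed = TRUE treats the whole pattern as a literal string (no regex)
literal <- grepl("Mr.", x, fixed = TRUE)

# Anchored with ^: a literal "Mr." only at the start of the string
anchored <- grepl("^Mr\\.", x)

print(data.frame(x, unescaped, escaped, literal, anchored))
```

Only the unescaped pattern flags "Mrs. robinson", and only the anchored pattern rejects "asdMr.dfa".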
This could also be achieved without the stringr package (inspired by @akrun):
df <- df %>% mutate(male = ifelse(grepl("^Mr\\.", name), TRUE, FALSE))
EDIT:
@docendo discimus pointed out that the ifelse() isn't necessary, since we're creating a logical column and that's exactly what grepl returns. So:
df <- df %>% mutate(male = grepl("^Mr\\.", name))
Without dplyr:
df <- transform(df, male = grepl("^Mr\\.", name))
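The question asked for the new column to contain "Male" rather than a logical value. A minimal sketch of that variant, assuming that non-matching rows should be `NA` (the question does not say what they should hold) and using `gender` as a hypothetical column name:

```r
library(dplyr)

df <- data.frame(
  name = c("Mr. Robinson", "Mrs. robinson", "Gandalf", "asdMr.dfa"),
  stringsAsFactors = FALSE
)

# `gender` is a hypothetical column name; NA for non-matches is an assumption
df <- df %>%
  mutate(gender = ifelse(grepl("^Mr\\.", name), "Male", NA_character_))

print(df)
```

Here the `ifelse()` is needed again, because the result is a label rather than the logical vector `grepl` returns. akrun's comment on the question takes a different fallback and keeps the original column value for non-matching rows.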

f.lechleitner
-
If you just return a logical true/false you don't need the `ifelse` since that's what `grepl` returns anyway. For that matter, you don't need dplyr either. It could simply be `transform(df, male = grepl("^Mr\\.", name))` – talat Jan 04 '18 at 09:27
-
hey cool, i didn't know that! makes perfect sense though, thanks – f.lechleitner Jan 04 '18 at 09:32