
I have 2000 names in a mix of two formats: "first name middle name last name" and "first name last name". My code only works for the names that include a middle name. Please see the toy example.

names <- c("SARAH AMY SMITH", "JACKY LEE", "LOVE JOY", "MONTY JOHN CARLO", "EVA LEE-YOUNG")
last.name <- gsub("[A-Z]+ [A-Z]*","\\", names)

last.name is

" SMITH" "" "" " CARLO" "-YOUNG"

JACKY LEE and LOVE JOY come back as empty strings.

P.S. This is not a duplicate post, since the previous ones do not use gsub.

YCM

2 Answers

Replace everything up to and including the last space with the empty string. No packages are used.

sub(".* ", "", names)
## [1] "SMITH"     "LEE"       "JOY"       "CARLO"     "LEE-YOUNG"
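
The reason this works is that `.*` is greedy, so `.* ` matches as far as the final space and `sub` removes that whole prefix. Below is a minimal sketch of the same idea wrapped in a function. The name `last_name` is hypothetical, and the `trimws` step rests on an assumption that is not part of the question, namely that some names might carry stray leading or trailing whitespace.

```r
# Hypothetical helper: return the last space-separated token of each name.
last_name <- function(x) {
  x <- trimws(x)       # assumption: strip stray leading/trailing whitespace
  sub(".* ", "", x)    # greedy .* consumes everything up to the last space
}

names <- c("SARAH AMY SMITH", "JACKY LEE", "LOVE JOY", "MONTY JOHN CARLO", "EVA LEE-YOUNG")
last_name(names)

# Without trimws, a trailing space makes the whole string match ".* ":
sub(".* ", "", "JACKY LEE ")   # gives ""
last_name("JACKY LEE ")        # gives "LEE"
```

A name with no space at all is returned unchanged, because the pattern does not match.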

Note:

Regarding the comment below about two-word last names: that does not appear to be part of the question as stated. If it were, suppose the first word of such a last name is either DEL or VAN. Replace the space after either of them with a colon, say, then perform the sub above, and finally change the colon back to a space.

names2 <- c("SARAH AMY SMITH", "JACKY LEE", "LOVE JOY", "MONTY JOHN CARLO", 
"EVA LEE-YOUNG", "ARTHUR DEL GATO", "MARY VAN ALLEN") # test data

sub(":", " ", sub(".* ", "", sub(" (DEL|VAN) ", " \\1:", names2)))
## [1] "SMITH"     "LEE"       "JOY"       "CARLO"     "LEE-YOUNG" "DEL GATO" 
## [7] "VAN ALLEN"
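
If more prefixes than DEL and VAN are needed, the same three steps can be driven by a vector of prefixes. This is only a sketch: the function name `last_name_particles` and its `particles` argument are hypothetical, and it assumes that the names contain no colon and at most one such prefix each.

```r
# Hypothetical helper; `particles` is an assumed, user-supplied vector.
last_name_particles <- function(x, particles = c("DEL", "VAN")) {
  # Build a pattern such as " (DEL|VAN) " from the particle vector.
  pattern <- paste0(" (", paste(particles, collapse = "|"), ") ")
  protected <- sub(pattern, " \\1:", x)    # shield the space after the particle
  last <- sub(".* ", "", protected)        # drop everything up to the last space
  sub(":", " ", last, fixed = TRUE)        # turn the colon back into a space
}

names2 <- c("SARAH AMY SMITH", "ARTHUR DEL GATO", "MARY VAN ALLEN")
last_name_particles(names2)
last_name_particles("ANNA DE SOUZA", particles = c("DEL", "VAN", "DE"))
```

Note that a middle name that happens to equal one of the prefixes would be treated as part of the last name, so the prefix vector should be chosen with the data in mind.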
G. Grothendieck

Alternatively, extract everything after the last space, that is, the run of non-space characters at the end of the string:

library(stringr)
str_extract(names, '[^ ]+$')
# [1] "SMITH"     "LEE"       "JOY"       "CARLO"     "LEE-YOUNG"
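
For comparison, the same pattern can be used without stringr via `regexpr` and `regmatches` from base R. This is a sketch rather than a drop-in replacement, because the two differ on strings where the pattern does not match: `regmatches` drops those elements instead of returning `NA`.

```r
names <- c("SARAH AMY SMITH", "JACKY LEE", "LOVE JOY", "MONTY JOHN CARLO", "EVA LEE-YOUNG")

# Locate the run of non-space characters at the end of each string.
m <- regexpr("[^ ]+$", names)
regmatches(names, m)

# Elements without a match (here, a string ending in a space) are dropped,
# so the result can be shorter than the input.
x <- c("JACKY LEE", "TRAILING ")
regmatches(x, regexpr("[^ ]+$", x))
```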

Or, as mikeck suggests, split the string on spaces and take the last word:

sapply(strsplit(names, " "), tail, 1)
# [1] "SMITH"     "LEE"       "JOY"       "CARLO"     "LEE-YOUNG"
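
A slightly stricter variant of the split approach, offered as a sketch: `vapply` declares that each name must yield exactly one string, so it stops with an error if that is not the case, whereas `sapply` would quietly return a list. The `fixed = TRUE` argument is an optional choice here that treats the separator as a literal space rather than a regular expression.

```r
names <- c("SARAH AMY SMITH", "JACKY LEE", "LOVE JOY", "MONTY JOHN CARLO", "EVA LEE-YOUNG")

# Split each name on literal spaces, then keep the last piece of each.
parts <- strsplit(names, " ", fixed = TRUE)
last.name <- vapply(parts, function(p) utils::tail(p, 1), character(1))
last.name
```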
Gregor Thomas