1

I have a field that has first and last names. Some names include a middle initial, some names include a suffix.

I am trying to find a formula that only pulls the last name regardless of which format it is in.

Example format

    Donald P Bellisario --> Bellisario
    Dale Earnhardt Jr --> Earnhardt
    Jimmy M Butler III--> Butler
    Kanye E West--> West
    Joseph Biden--> Biden


Formula 1: =TRIM(RIGHT(SUBSTITUTE(AS9," ",REPT(" ",LEN(AS9))),LEN(AS9)))

Formula 2:=RIGHT(AS9,LEN(AS9)-FIND("*",SUBSTITUTE(AS9," ","*",LEN(AS9)-LEN(SUBSTITUTE(AS9," ","")))))

Formula 1 and 2 do not ignore suffixes and will list those if existent Jack Smith Jr--> Jr

Formula 3: =SUBSTITUTE(TRIM(RIGHT(SUBSTITUTE(TRIM(SUBSTITUTE($AS9,IFERROR(RIGHT($AS9,LEN(AS9)-FIND(" ",$AS9)-10),""),""))," ",REPT(" ","99")),99)),",","")

Formula 3 will only include 10 characters after the end of the first name without displaying the middle initial. E.G(Heisenberger--> Heisenberg)

Merk
  • 33
  • 9

1 Answers1

1

Truth is, working with names can be subject to various edge-cases that will prove a working solution wrong at some point. But for those samples shown I'd use FILTERXML() to "split" these input strings on the spaces and use xpath expressions to filter out those substrings:

enter image description here

Formula in B1:

=FILTERXML("<t><s>"&SUBSTITUTE(A1," ","</s><s>")&"</s></t>","//s[translate(., 'ABCDEFGHIJKLMNOPQRSTUVWXYZ.', '')!=''][translate(., 'aeiouAEIOU', '')!=.][last()]")

The trick here is that there are three coherent xpath expressions working together:

  • [translate(., 'ABCDEFGHIJKLMNOPQRSTUVWXYZ.', '')!=''] - Assert that node is not nothing when all uppercase and dots have been substituted with nothing.
  • [translate(., 'aeiouAEIOU', '')!=.] - Assert that node is not equal to its original node when all vowels (upper- and lowercase) have been substituted with nothing.
  • [last()] - The last() function returns an integer equal to the context size from the expression evaluation context, and thus it will return the last node that compiled testing against previous expressions.

I'd guess that depending on possible edge-cases you could add more rules to the equation. For a more comprehensive insight on these expressions you could have a look here.

Good luck.

JvdV
  • 70,606
  • 8
  • 39
  • 70
  • 1
    Worked great, there are no edge cases where it didn't work in the dataset that I can see. Thank you – Merk May 25 '21 at 12:04