I'm trying to write a regex for names. A name can optionally start with a title (Dr., Mrs., etc.) and otherwise contains two or three names, with the middle name optionally abbreviated to an initial followed by a period (for example, B.).

For instance, the following names should be matched:

  • Dr. Jeff T. Walker
  • Susan B. Anthony
  • Mr. Michael Binghamton
  • Mrs. George Bush

The following should not be matched:

  • Garfield
  • Dr. J
  • T. Pain
  • The United States of America
  • February 15 2020

Here is what I have:

^(Dr\.|Mr\.|Mrs\.)?[A-Z][a-z]+\s([A-Z][a-z]+|[A-Z]\.)\s[A-Z][a-z]+?

I'm not quite sure where I'm going wrong here.

Ted

1 Answer

^((Dr|Mr|Mrs)\. )?[A-Z][a-z]+( [A-Z]([a-z]+|\.))? [A-Z][a-z]+

This is what I did to fix it:

  • Added a space after the prefix - before, you were matching things like "Dr.James", rather than "Dr. James"
  • Removed the question mark at the end, after the last name - when ? follows a quantifier such as +, rather than a group or a single character, it makes that quantifier "lazy", matching as few characters as possible (in this case, one lowercase letter)
  • Made the middle name optional
  • Removed some redundancies (such as in the prefix and middle name)
  • Replaced \s with literal spaces - it's easier to read, and \s also matches tabs, newlines, etc.
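
To see the effect of these changes, here is a minimal Python sketch that compares the pattern from the question with the fixed one. The names ORIGINAL, FIXED and matched_text are only for illustration, and Python's re module is an assumption, since the question does not say which regex engine is in use.

```python
import re

# Pattern from the question: no space after the title, and a lazy +? at the end.
ORIGINAL = re.compile(
    r"^(Dr\.|Mr\.|Mrs\.)?[A-Z][a-z]+\s([A-Z][a-z]+|[A-Z]\.)\s[A-Z][a-z]+?"
)

# Fixed pattern from this answer.
FIXED = re.compile(
    r"^((Dr|Mr|Mrs)\. )?[A-Z][a-z]+( [A-Z]([a-z]+|\.))? [A-Z][a-z]+"
)


def matched_text(pattern, name):
    """Return the part of `name` the pattern matched, or None if no match."""
    m = pattern.match(name)
    return m.group(0) if m else None


for name in ["Dr. Jeff T. Walker", "Susan B. Anthony", "Mrs. George Bush"]:
    print(name, "|", matched_text(ORIGINAL, name), "|", matched_text(FIXED, name))
```

With the original pattern, the lazy +? stops after a single lowercase letter of the last name, and names with a title or without a middle name do not match at all. The fixed pattern returns the whole name in each of these three cases.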
Arithmomaniac
  • aha! you're a miracle worker here... been struggling with this for 30+ min. thank you for your thorough response & streamlining my existing code! – Ted Mar 05 '13 at 19:42
  • You may also want the $ at the end so you can't match something ridiculous like `Dr. Jeff T. Walker !"£$%^&*()0987654321` – c24w Mar 05 '13 at 19:43
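
A small Python sketch of that suggestion, checked against the examples from the question. The names NAME and is_name are hypothetical, and Python's re module is assumed only for illustration. Without the trailing $, the pattern also accepts the leading "The United States" of "The United States of America", so the anchor is needed for that case too.

```python
import re

# The answer's pattern with $ appended, so the whole string must be a name.
NAME = re.compile(
    r"^((Dr|Mr|Mrs)\. )?[A-Z][a-z]+( [A-Z]([a-z]+|\.))? [A-Z][a-z]+$"
)


def is_name(text):
    """Return True if the entire string looks like a name."""
    return NAME.match(text) is not None


should_match = [
    "Dr. Jeff T. Walker",
    "Susan B. Anthony",
    "Mr. Michael Binghamton",
    "Mrs. George Bush",
]
should_not_match = [
    "Garfield",
    "Dr. J",
    "T. Pain",
    "The United States of America",
    "February 15 2020",
]

for text in should_match + should_not_match:
    print(text, "->", is_name(text))
```

In Python, $ also matches just before a single trailing newline, so re.fullmatch (or \Z in place of $) is the stricter choice if the input may end with a line break.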