I'd like a Regular Expression for C# that matches "Johnson", "Del Sol", or "Del La Range"; in other words, it should match words with spaces in the middle but no space at the start or at the end.
-
What does the input string look like? Is the last name the only part of the string, or is it a sentence, or possibly a full name with optionally more spaces? I think context is important here. – Rich Mar 10 '09 at 21:37
7 Answers
^\p{L}+(\s+\p{L}+)*$
This regex has the following features:
- Will match a one letter last name (e.g. Malcolm X's last name)
- Will not match last names containing numbers (like anything with a \w or a [^ ] will)
- Matches Unicode letters
But what about last names like "O'Connor" or hyphenated last names ... hmm ...
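As a minimal sketch of how the pattern above might be used from C# (the class and method names here are illustrative assumptions, not part of the answer):

```csharp
using System;
using System.Text.RegularExpressions;

public static class LastNameCheck
{
    // Pattern from the answer: runs of Unicode letters separated by whitespace.
    private static readonly Regex Pattern = new Regex(@"^\p{L}+(\s+\p{L}+)*$");

    // Hypothetical helper: true when the whole input fits the pattern.
    public static bool IsMatch(string input) => Pattern.IsMatch(input);

    public static void Main()
    {
        string[] samples =
        {
            "Johnson", "Del Sol", "Del La Range", "X",
            " Johnson", "Johnson ", "J0hnson", "O'Connor"
        };

        foreach (string s in samples)
            Console.WriteLine($"\"{s}\" -> {IsMatch(s)}");
    }
}
```

The first four samples match; the ones with a leading or trailing space, a digit, or an apostrophe do not.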

This should do the job:
^[a-zA-Z][a-zA-Z ]*[a-zA-Z]$
Edit: Here's a slight improvement that allows one-letter names and hyphens/apostrophes in the name:
^[a-zA-Z'][a-zA-Z' -]*[a-zA-Z']?$
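A small C# sketch for trying this pattern out; the class name and sample inputs are assumptions made for illustration. One detail worth knowing: inside a character class, `'- ` is read as a range from the apostrophe to the space, which .NET rejects as being in reverse order, so the sketch keeps the hyphen as the last character of the class.

```csharp
using System;
using System.Text.RegularExpressions;

public static class PunctuatedNameCheck
{
    // Hyphen placed last in the class so it is a literal, not a range operator.
    private static readonly Regex Pattern =
        new Regex(@"^[a-zA-Z'][a-zA-Z' -]*[a-zA-Z']?$");

    // Hypothetical helper: true when the whole input fits the pattern.
    public static bool IsMatch(string input) => Pattern.IsMatch(input);

    public static void Main()
    {
        string[] samples = { "O'Connor", "Smith-Jones", "Del Sol", "X", " Sol", "Sol ", "a--" };

        foreach (string s in samples)
            Console.WriteLine($"\"{s}\" -> {IsMatch(s)}");
    }
}
```

Only " Sol" is rejected here. Because the final class is optional, the middle class can consume a trailing space or hyphen, so "Sol " and "a--" still pass; a stricter variant would have to make the final letter mandatory whenever more than one character is present.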

-
Malcolm X would not be happy about this... (requiring minimum of 2 letter last names that is...) – Daniel LeCheminant Mar 10 '09 at 21:42
-
The shortest REAL name I can think of is "Ng." Should be fine. ;) – Samantha Branham Mar 10 '09 at 21:48
-
Yeah, I noticed that upon review, but didn't bother changing because I didn't consider a one-letter last name... Post is edited now anyway with a few other improvements. – Noldorin Mar 10 '09 at 23:42
-
+1 for tackling ' and -. (I don't know if the first character needs to accept an apostrophe though... or if a-- should be a valid last name) – Daniel LeCheminant Mar 11 '09 at 00:00
-
@Daniel: Cheers. And yeah, it *probably* doesn't need to accept ' as the first char, but can't hurt. Note that it shouldn't accept a hyphen as the last char, so a-b would be valid but not a-- (unless one of my quantifiers is wrong). – Noldorin Mar 11 '09 at 00:07
-
How would I change this to only allow single spaces inside the name, not more than one space? – Caveatrob Mar 11 '09 at 20:53
-
I take it you mean not more than one space in a row? Try the following (it may not quite work, as I haven't tested): ^[a-zA-Z'](([a-zA-Z])+[' -]?)*[a-zA-Z']?$ – Noldorin Mar 11 '09 at 22:04
In the name "Ṣalāḥ ad-Dīn Yūsuf ibn Ayyūb" (see http://en.wikipedia.org/wiki/Saladdin), which is the first name, and which is the last? What about in the name "Roberto Garcia y Vega" (invented)? "Chiang Kai-shek" (see http://en.wikipedia.org/wiki/Chang_Kai-shek)?
Spaces in names are the least of your problems! See Personal names in a global application: What to store.

-
I agree. No matter how hard you try you will always find names that don't match correctly. I mean, if you don't have complete control on what names you are parsing. – Sergio Acosta Mar 10 '09 at 22:36
Here's a better one:
/^[a-zA-Z]+(([\'\,\.\- ][a-zA-Z ])?[a-zA-Z]*)*$/
Allows standard punctuation and spaces, but cannot start with punctuation.

The ? modifier is your friend. Placed after a quantifier, it makes the shortest possible match instead of a greedy one. Use it for the first name, as in:
^(.+?) (.+)$
Group 1 grabs everything up to the first space, group 2 gets the rest.
Of course, now what do you do if the first name contains spaces?
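To make the grouping concrete, here is a hedged C# sketch of the split described above. The `NameSplitter` class and `TrySplit` method are hypothetical names introduced for illustration.

```csharp
using System;
using System.Text.RegularExpressions;

public static class NameSplitter
{
    // Lazy first group stops at the first space; the second group takes the rest.
    private static readonly Regex Pattern = new Regex(@"^(.+?) (.+)$");

    // Hypothetical helper: splits "first rest" on the first space.
    public static bool TrySplit(string input, out string first, out string rest)
    {
        Match m = Pattern.Match(input);
        first = m.Success ? m.Groups[1].Value : string.Empty;
        rest = m.Success ? m.Groups[2].Value : string.Empty;
        return m.Success;
    }

    public static void Main()
    {
        if (TrySplit("John Del Sol", out string first, out string rest))
            Console.WriteLine($"first = \"{first}\", rest = \"{rest}\"");
    }
}
```

For "John Del Sol" this yields "John" and "Del Sol". Since `.` accepts any character, the same call also succeeds on input that contains no letters at all.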

-
Nice and simple, but I think it will match "238 39592" as well, which aren't words. – Samantha Branham Mar 10 '09 at 21:26
-
Not sure if the OP wants to match the last name by itself or within a string containing both the first and last names... I supposed the former, while you seem to have done the latter. Still, it appears your regex allows spaces at the start or end, which needs to be fixed. – Noldorin Mar 10 '09 at 21:40
I think this is more what you were looking for:
^[^ ][a-zA-Z ]+[^ ]$
This requires a non-space character at the beginning of the line, then letters or spaces, then a non-space character at the end.
This works in irb, and the last time I worked with C# I used similar regexes:
(0 means a match starting at index 0; nil means no match)
>> "Di Giorno" =~ /^[^ ][a-zA-Z ]+[^ ]$/
=> 0
>> "DiGiorno" =~ /^[^ ][a-zA-Z ]+[^ ]$/
=> 0
>> " DiGiorno" =~ /^[^ ][a-zA-Z ]+[^ ]$/
=> nil
>> "DiGiorno " =~ /^[^ ][a-zA-Z ]+[^ ]$/
=> nil
>> "Di Gior no" =~ /^[^ ][a-zA-Z ]+[^ ]$/
=> 0
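A rough C# counterpart to the irb session, written as a sketch: the class name is an assumption, and `Regex.IsMatch` returns a bool rather than a match index.

```csharp
using System;
using System.Text.RegularExpressions;

public static class EdgeSpaceCheck
{
    // Same pattern as the irb session: non-space, letters or spaces, non-space.
    private static readonly Regex Pattern = new Regex(@"^[^ ][a-zA-Z ]+[^ ]$");

    // Hypothetical helper: true when the whole input fits the pattern.
    public static bool IsMatch(string input) => Pattern.IsMatch(input);

    public static void Main()
    {
        string[] samples =
        {
            "Di Giorno", "DiGiorno", " DiGiorno", "DiGiorno ", "Di Gior no",
            "1Smith!", "Ng"
        };

        foreach (string s in samples)
            Console.WriteLine($"\"{s}\" -> {IsMatch(s)}");
    }
}
```

The first five inputs behave as in the irb output. The last two show side effects of the pattern: `[^ ]` accepts any non-space character, so "1Smith!" passes, and at least three characters are required, so "Ng" fails.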

-
Using the [^ ] will match last names starting or ending with numbers, punctuation, etc... – Daniel LeCheminant Mar 10 '09 at 21:38
-
Danny's right. I responded with the same solution and retracted it when I realized this. – Samantha Branham Mar 10 '09 at 21:49