Regex to grab full firstname and first letter of last name

Question

I have a list of users grabbed by the Etc Ruby library:

Thomas_J_Perkins

Jennifer_Scanner

Amanda_K_Loso

Aaron_Cole

Mark_L_Lamb

What I need to do is grab the full first name, skip the middle name (if given), and grab the first character of the last name. The output should look like this:

Thomas P

Jennifer S

Amanda L

Aaron C

Mark L

I'm not sure how to do this. I've tried matching all of the characters with /\w+/, but that grabs everything.

Define "first name" and "last name". In what culture? Don't assume that a first name occurs first; you can inadvertently insult a customer by not processing their name correctly. Read "[ask]" including the links, and "[mcve]". We expect to see evidence of your effort. As is, it looks like you haven't tried and want us to write the code for you, which is off-topic, or to write a tutorial for you, which again is off-topic. – the Tin Man May 09 '16 at 16:09

score 6 · Answer 1 · answered May 09 '16 at 15:51

I think it's simpler without regex:

array = "Thomas_J_Perkins".split("_") # split at _
array.first + " " + array.last[0] # .first is the first name; .last[0] is the first character of the last name
#=> "Thomas P"
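To apply the same idea to the whole list from the question, one option is to wrap it in a small helper. This is only a minimal sketch: the method name `short_name` and the `users` array are made up for illustration, and it assumes every entry is non-empty and contains at least one underscore.

```ruby
# Hypothetical helper: full first name plus the first letter of the last name.
def short_name(full_name)
  parts = full_name.split("_")      # split once and reuse the result
  "#{parts.first} #{parts.last[0]}" # middle parts, if any, are ignored
end

# The user names from the question.
users = %w[Thomas_J_Perkins Jennifer_Scanner Amanda_K_Loso Aaron_Cole Mark_L_Lamb]
short_names = users.map { |user| short_name(user) }
p short_names
#=> ["Thomas P", "Jennifer S", "Amanda L", "Aaron C", "Mark L"]
```

An empty string would raise a NoMethodError here, so real input may need a guard.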

shivam

score 6 · Accepted Answer · answered May 09 '16 at 15:51

You don't always need regular expressions.

"Some people, when confronted with a problem, think 'I know, I'll use regular expressions.' Now they have two problems." (Jamie Zawinski)

You can do it with some simple Ruby code:

string = "Mark_L_Lamb"
string.split('_').first + ' ' + string.split('_').last[0]
=> "Mark L"

SteveTurczyn

Was just typing the same. – AM Douglas May 09 '16 at 15:53
Good answer, but calling `string.split("_")` twice is needless. – Jordan Running May 09 '16 at 16:00
@amdouglas Would that still work even if there's no middle initial? – JasonBorne May 09 '16 at 16:05
@JasonBorne yes, it takes the full first word and the first letter of the last word. It doesn't care if you have two or three words. – SteveTurczyn May 09 '16 at 16:10
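
Picking up the two points raised in these comments, here is a sketch that splits only once and shows that the result has the same shape with or without a middle initial. The method name `first_and_initial` is hypothetical, and the code assumes the string contains at least one underscore (otherwise `last` would be nil).

```ruby
# Hypothetical helper: split once; the splat soaks up any middle parts (possibly none).
def first_and_initial(string)
  first, *_middle, last = string.split('_')
  "#{first} #{last[0]}"
end

with_middle    = first_and_initial("Mark_L_Lamb") # "Mark L"
without_middle = first_and_initial("Aaron_Cole")  # "Aaron C"
```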

Wiktor Stribiżew · Answer 3 · 2016-05-09T15:56:12.883

You can use

^([^\W_]+)(?:_[^\W_]+)*_([^\W_])[^\W_]*$

And replace with \1_\2. See the regex demo

The [^\W_] matches a letter or a digit. If you want to only match letters, replace [^\W_] with \p{L}.

^(\p{L}+)(?:_\p{L}+)*_(\p{L})\p{L}*$

See updated demo

The point is to match and capture the first chunk of letters up to the first _ (with (\p{L}+)), then match zero or more sequences of _ followed by letters, plus the final _ (with (?:_\p{L}+)*_), then match and capture the first letter of the last word (with (\p{L})), and finally match the rest of the string (with \p{L}*).

NOTE: replace ^ with \A and $ with \z if you have independent strings (as in Ruby ^ matches the start of a line and $ matches the end of the line).
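
The difference between the two kinds of anchors only shows up when a string contains a newline. A small sketch, using a made-up two-line string:

```ruby
# Made-up input with an embedded newline, e.g. a file read in one go.
multi_line = "garbage\nMark_L_Lamb"

line_anchored   = /^(\p{L}+)(?:_\p{L}+)*_(\p{L})\p{L}*$/
string_anchored = /\A(\p{L}+)(?:_\p{L}+)*_(\p{L})\p{L}*\z/

# ^ and $ are satisfied by the second line alone.
matches_line   = multi_line.match?(line_anchored)   # true
# \A and \z require the whole string to be a name.
matches_string = multi_line.match?(string_anchored) # false
```

`String#match?` assumes Ruby 2.4 or later; on older versions `=~` gives the same information.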

Ruby code:

s.sub(/^(\p{L}+)(?:_\p{L}+)*_(\p{L})\p{L}*$/, "\\1_\\2")
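
Applied to the list from the question, the substitution could look like the sketch below. Two things differ from the snippet above and are adaptations rather than part of the original snippet: the anchors are \A and \z, since each name is an independent string, and the replacement uses a space, because the requested output is "Thomas P" rather than "Thomas_P". The variable names are illustrative.

```ruby
name_re = /\A(\p{L}+)(?:_\p{L}+)*_(\p{L})\p{L}*\z/

# The user names from the question.
users = %w[Thomas_J_Perkins Jennifer_Scanner Amanda_K_Loso Aaron_Cole Mark_L_Lamb]

# Assumption: a space in the replacement, to match the requested output.
shortened = users.map { |s| s.sub(name_re, '\1 \2') }
p shortened
#=> ["Thomas P", "Jennifer S", "Amanda L", "Aaron C", "Mark L"]
```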

[Please don't restrict names to only word characters.](http://stackoverflow.com/questions/2385701/regular-expression-for-first-and-last-name/32517316#32517316) `[^_]` instead of `[^\W_]` or `\p{L}` is perfectly acceptable here. — Aran-Fey, May 09 '16 at 17:00
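
To illustrate the comment above, here is a sketch comparing the letters-only pattern with a variant built on `[^_]`. The sample name is invented, and the space in the replacement is an assumption based on the output requested in the question.

```ruby
letters_only  = /\A(\p{L}+)(?:_\p{L}+)*_(\p{L})\p{L}*\z/
anything_else = /\A([^_]+)(?:_[^_]+)*_([^_])[^_]*\z/

name = "Anne-Marie_O'Neil" # made-up example with a hyphen and an apostrophe

# The hyphen is not a letter, so there is no match and sub returns the input unchanged.
strict  = name.sub(letters_only, '\1 \2')
# [^_] accepts any character except the separator.
lenient = name.sub(anything_else, '\1 \2') # "Anne-Marie O"
```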

Cary Swoveland · Answer 4 · 2016-05-09T18:52:49.560

I'm in the don't-use-a-regex-for-this camp.

str1 = "Alexander_Graham_Bell"
str2 = "Sylvester_Grisby"

"#{str1[0...str1.index('_')]} #{str1[str1.rindex('_')+1]}"
  #=> "Alexander B"
"#{str2[0...str2.index('_')]} #{str2[str2.rindex('_')+1]}"
  #=> "Sylvester G"

or

first, last = str1.split(/_.+_|_/)
  #=> ["Alexander", "Bell"] 
first+' '+last[0]
  #=> "Alexander B" 

first, last = str2.split(/_.+_|_/)
  #=> ["Sylvester", "Grisby"] 
first+' '+last[0]
  #=> "Sylvester G"

but if you insist...

r = /
    (.+?)     # match any characters non-greedily in capture group 1
    (?=_)     # match an underscore in a positive lookahead 
    (?:.*)    # match any characters greedily in a non-capture group 
    (?:_)     # match an underscore in a non-capture group
    (.)       # match any character in capture group 2
    /x        # free-spacing regex definition mode

str1 =~ r
$1+' '+$2
  #=> "Alexander B"

str2 =~ r
$1+' '+$2
  #=> "Sylvester G"

You can of course write

r = /(.+?)(?=_)(?:.*)(?:_)(.)/
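
If relying on the globals $1 and $2 feels fragile, the same regex can be used with String#match, which returns a MatchData object, or nil when nothing matches. A sketch reusing the one-line regex above; the method name `abbreviate` is hypothetical:

```ruby
r = /(.+?)(?=_)(?:.*)(?:_)(.)/

# Hypothetical helper that avoids the $1/$2 globals.
def abbreviate(str, regex)
  m = str.match(regex)
  return nil unless m # no underscore, so nothing to abbreviate
  first, initial = m.captures
  "#{first} #{initial}"
end

bell   = abbreviate("Alexander_Graham_Bell", r) # "Alexander B"
grisby = abbreviate("Sylvester_Grisby", r)      # "Sylvester G"
none   = abbreviate("Cher", r)                  # nil
```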

score 0 · Answer 5 · answered May 09 '16 at 15:51

This is my attempt:

/([a-zA-Z]+)_([a-zA-Z]+_)?([a-zA-Z])/

See demo
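
One possible way to use this pattern from Ruby is String#[] with a regex and a group number, which returns that capture or nil. This is a sketch, not part of the original answer; note that the pattern is unanchored and allows at most one middle name.

```ruby
re = /([a-zA-Z]+)_([a-zA-Z]+_)?([a-zA-Z])/

name = "Amanda_K_Loso"
# Group 1 is the first name, group 3 the first letter of the last name.
short = "#{name[re, 1]} #{name[re, 3]}" # "Amanda L"

no_middle = "Aaron_Cole"
short_no_middle = "#{no_middle[re, 1]} #{no_middle[re, 3]}" # "Aaron C"
```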

xuanduc987

score 0 · Answer 6 · answered May 09 '16 at 15:52

Let's see if this works:

/^([^_]+)(?:_\w)?_(\w)/

And then you'll have to combine the first and second matches into the format you want. I don't know Ruby, so I can't help you there.
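
For the Ruby side, combining the two captures could be sketched as follows. The method name `combine` is hypothetical. The last call shows a limitation of the pattern: the optional middle part is a single character, so a spelled-out middle name gets mistaken for the last name.

```ruby
pattern = /^([^_]+)(?:_\w)?_(\w)/

# Hypothetical helper: returns nil when the name does not match.
def combine(name, pattern)
  m = pattern.match(name)
  m && "#{m[1]} #{m[2]}"
end

initial_middle = combine("Thomas_J_Perkins", pattern)      # "Thomas P"
no_middle      = combine("Jennifer_Scanner", pattern)      # "Jennifer S"
long_middle    = combine("Alexander_Graham_Bell", pattern) # "Alexander G", not "Alexander B"
```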

IslandUsurper

Ron Rosenfeld · Answer 7 · 2016-05-09T16:12:43.377

0

And another attempt using a replacement method:

result = subject.gsub(/^([^_]+)(?:_[^_])?_([^_])[^_]+$/, '\1 \2')

We match the entire string, with the relevant parts in capturing groups, and then return just the two captured groups separated by a space.
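
Run against the list from the question, this behaves as sketched below (the variable names are illustrative). Because the optional middle part is a single character, a name with a spelled-out middle name does not match at all, and gsub then returns the string unchanged.

```ruby
re = /^([^_]+)(?:_[^_])?_([^_])[^_]+$/

# The user names from the question.
users = %w[Thomas_J_Perkins Jennifer_Scanner Amanda_K_Loso Aaron_Cole Mark_L_Lamb]
results = users.map { |subject| subject.gsub(re, '\1 \2') }
p results
#=> ["Thomas P", "Jennifer S", "Amanda L", "Aaron C", "Mark L"]

# No match here, so the input comes back as is.
unchanged = "Alexander_Graham_Bell".gsub(re, '\1 \2')
p unchanged
#=> "Alexander_Graham_Bell"
```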

edited May 09 '16 at 16:12

answered May 09 '16 at 16:07

score 0 · Answer 8 · answered May 09 '16 at 16:22

Using the split method is much better:

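full_names = %w[Thomas_J_Perkins Jennifer_Scanner Amanda_K_Loso Aaron_Cole Mark_L_Lamb]

full_names.map do |full_name|
  parts = full_name.split('_').values_at(0, -1) # keep only the first and last parts
  parts.last.slice!(1..-1)                      # trim the last name down to its first character
  parts.join(' ')
end
#=> ["Thomas P", "Jennifer S", "Amanda L", "Aaron C", "Mark L"]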

Nafaa Boutefer

I suggest `first, last = full_name.split('_').values_at(0,-1); first+last[0]`. – Cary Swoveland May 09 '16 at 16:44

score -1 · Answer 9 · answered May 09 '16 at 16:23

/^[A-Za-z]{5,15}\s[A-Za-z]{1}]$/i This enforces the following criteria: 5-15 characters for the first name, then a whitespace character, and finally a single character for the last name.

Avneesh Srivastava

[Never, ever restrict names to word characters.](http://stackoverflow.com/a/32517316/1222951) Also, people named "Bill" or "Paul" or "Anne" might have a problem with the 5-15 character criteria. And what's that `\s` doing in your pattern anyway? – Aran-Fey May 09 '16 at 16:51
\s denotes whitespace, so that there is a space between the first name and the last name. If length is an issue then you can use {,upperLimit} anytime. – Avneesh Srivastava May 09 '16 at 16:54
The point is, your pattern doesn't work because of the `\s`. You want to match an underscore, not a space. – Aran-Fey May 09 '16 at 16:56
In that case: /^[A-Za-z]{2,15}[_]{1}[A-Za-z]{1}]$/i – Avneesh Srivastava May 09 '16 at 17:04
I'm starting to think that you misunderstood OP's question. Your pattern isn't supposed to match names like "Mark L", it's supposed to turn "Mark_L_Lamb" _into_ "Mark L". – Aran-Fey May 09 '16 at 17:08
