Optimizing pandas operation: combining first/middle/last name columns

Question

Let's say I take a sample of names such as these separated by individual fields:

indx  First Name   Middle Name     Last Name
0     CHARITIXAN   K.R.,           NICHOLS
1           None   Johnny-Boy      CHAVEZ
2          ISAAC   None            ESPARZA
3        MICHAEL   nan             
4         Andrew                   Pfaff

Let's also assume the data is held in a pandas DataFrame (df) and has been cleaned (via the .replace method) so that every remaining value is either a non-empty string or an empty string.

indx  First Name   Middle Name     Last Name
0     CHARITIXAN   K.R.,           NICHOLS
1                  Johnny-Boy      CHAVEZ
2          ISAAC                   ESPARZA
3        MICHAEL               
4         Andrew                   Pfaff

I want to combine all parts of a given name with ONLY a single space between each name segment. Based on my research and implementation, the best solution I found was the one below, which uses re. Is this the optimal way, or is there something better for this particular case?

My final approach was this (with re imported):

df['full_name']=df[['First Name', 'Middle Name', 'Last Name']].apply(lambda x: re.sub(' +', ' ', ' '.join(x)), axis=1)

can't you just add them together `df['full_name']=df['First Name'] +' ' + df['Middle Name'] + ' ' + df['Last Name']` — Kenan, Jan 20 '20 at 14:47
@kenan that's not "ONLY a single space" if middle or last name are empty. — anishtain4, Jan 20 '20 at 14:52
assuming names is a list of your columns `df[names].apply(lambda x : x.str.cat(sep=' '),axis=1)` — Umar.H, Jan 20 '20 at 15:07

score 4 · Accepted Answer · answered Jan 20 '20 at 14:59

You can apply join as:

df['full_name'] = df[['First Name','Middle Name', 'Last Name']].apply(lambda x: ' '.join(x), axis=1)
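
Note that a plain join still puts a separator around every empty field, so a row with an empty middle name ends up with two spaces. One way around that is to join only the non-empty parts. This is a minimal sketch, assuming the columns have already been cleaned to plain strings as described in the question; the sample frame is rebuilt here just for illustration.

```python
import pandas as pd

cols = ["First Name", "Middle Name", "Last Name"]

# Sample data mirroring the cleaned frame from the question (empty strings, no None/NaN).
df = pd.DataFrame({
    "First Name": ["CHARITIXAN", "", "ISAAC", "MICHAEL", "Andrew"],
    "Middle Name": ["K.R.,", "Johnny-Boy", "", "", ""],
    "Last Name": ["NICHOLS", "CHAVEZ", "ESPARZA", "", "Pfaff"],
})

# Keep only the non-empty parts of each row, then join them with one space.
df["full_name"] = df[cols].apply(
    lambda row: " ".join(part for part in row if part), axis=1
)

print(df["full_name"].tolist())
```

Because empty strings are dropped before joining, there is nothing to collapse afterwards, so no regex pass should be needed.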

anishtain4

score 1 · Answer 2 · answered Apr 01 '23 at 17:08

You can use this:

df['full_name'] = df.apply(lambda row: row['First Name'] + ' ' + row['Middle Name'] + ' ' + row['Last Name'], axis=1)
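
Plain concatenation like this leaves extra spaces whenever one of the parts is empty, and collapsing repeated spaces alone still leaves a leading or trailing space when the first or last part is empty. If that matters, a possible variant is to concatenate with the column-wise string methods and clean up afterwards. This is a sketch under the same assumption as the question (all values are strings, possibly empty); the sample frame is only there to make it runnable, and whether it is actually faster than apply on your data is something you would need to time yourself.

```python
import pandas as pd

# Sample data mirroring the cleaned frame from the question.
df = pd.DataFrame({
    "First Name": ["CHARITIXAN", "", "ISAAC", "MICHAEL", "Andrew"],
    "Middle Name": ["K.R.,", "Johnny-Boy", "", "", ""],
    "Last Name": ["NICHOLS", "CHAVEZ", "ESPARZA", "", "Pfaff"],
})

# Concatenate the three columns element-wise with a single-space separator.
joined = df["First Name"].str.cat([df["Middle Name"], df["Last Name"]], sep=" ")

# Collapse runs of whitespace to one space, then trim both ends.
df["full_name"] = joined.str.replace(r"\s+", " ", regex=True).str.strip()

print(df["full_name"].tolist())
```

The final .str.strip() is what handles rows such as index 1 and 3, where the first or last part is empty.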

Ved Prakash Shukla
