My DataFrame has 500 columns whose names follow this pattern:

GroupNoIDFirstNameLastName (e.g. column names Gr1234AdamSmith, Gr2567DavidBlake, ...)

where Gr1, Gr2 = GroupNo; 234, 567 = ID; AdamSmith, DavidBlake = FirstNameLastName.

I would like to rename all 500 columns so that they show only the first and last name,

for example Adam Smith, David Blake.

How can I do this?

Thank you so much for your help.

2 Answers

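First strip the group number and ID with the slice [6:] (this assumes the prefix is always six characters long, as in your examples). Then split the first and last name with the function from "How to do CamelCase split in python", and use ' '.join() to put a space between them: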

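import re

def camel_case_split(identifier):
    # Split at lowercase->uppercase boundaries (and before the last capital of an acronym)
    matches = re.finditer(r'.+?(?:(?<=[a-z])(?=[A-Z])|(?<=[A-Z])(?=[A-Z][a-z])|$)', identifier)
    return [m.group(0) for m in matches]

# Rename by reassigning the columns: drop the 6-character prefix, then join the name parts
df.columns = df.columns.map(lambda x: ' '.join(camel_case_split(x[6:])))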
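To check the logic without the full DataFrame, here is a minimal sketch that runs the same steps on the two column names from the question. The helper name `strip_prefix_and_space` and its `prefix_len` parameter are made up for this illustration, and the fixed length of 6 is an assumption taken from the examples.

```python
import re

def camel_case_split(identifier):
    # Same splitter as above: cut at lowercase->uppercase boundaries
    matches = re.finditer(r'.+?(?:(?<=[a-z])(?=[A-Z])|(?<=[A-Z])(?=[A-Z][a-z])|$)', identifier)
    return [m.group(0) for m in matches]

def strip_prefix_and_space(name, prefix_len=6):
    # Hypothetical helper: drop the fixed-length "GrNNNN" prefix, then space the name parts
    return ' '.join(camel_case_split(name[prefix_len:]))

# Sample column names taken from the question
sample = ['Gr1234AdamSmith', 'Gr2567DavidBlake']
renamed = [strip_prefix_and_space(c) for c in sample]
print(renamed)  # ['Adam Smith', 'David Blake']
```

If some prefixes are longer or shorter than six characters, the slice will cut in the wrong place, so it is worth checking a few of the 500 names first.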
SCKU

You can use the rename method of pandas:

import re

def get_first_last(string):
    # 1. Split on runs of digits (group number and ID) and keep the last piece:
    #    'Gr1234AdamSmith' -> ['Gr', 'AdamSmith'] -> 'AdamSmith'
    fullname = re.split(r'[0-9]+', string)[-1]
    # 2. Insert a space at each lowercase->uppercase boundary (raw string avoids an invalid '\g' escape)
    first_last = re.sub(r'([a-z])([A-Z])', r'\g<1> \g<2>', fullname)
    return first_last

# Get the first and last name for each column and rename the columns in place
df.rename(columns={c: get_first_last(c) for c in df.columns}, inplace=True)
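Because the group number and ID are dropped, two columns could end up with the same name if the same person appears in more than one group. The sketch below builds the rename mapping on a small list of names and reports any collisions before you call rename. The third column name, `Gr3890AdamSmith`, is invented for the illustration and is not from the question.

```python
import re
from collections import Counter

def get_first_last(string):
    # Keep the text after the last run of digits, then space the camelCase boundary
    fullname = re.split(r'[0-9]+', string)[-1]
    return re.sub(r'([a-z])([A-Z])', r'\g<1> \g<2>', fullname)

# Hypothetical sample; with a real DataFrame this would be list(df.columns)
columns = ['Gr1234AdamSmith', 'Gr2567DavidBlake', 'Gr3890AdamSmith']

# Old name -> new name, the same dict that is passed to df.rename(columns=...)
mapping = {c: get_first_last(c) for c in columns}

# New names that would be produced more than once
duplicates = [name for name, n in Counter(mapping.values()).items() if n > 1]

print(mapping)
print(duplicates)  # ['Adam Smith']
```

If `duplicates` comes back empty, the mapping can be passed to `df.rename(columns=mapping)` as is.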
chatax