I'm trying to figure out how to split the name from the surname into two new dataframe columns, without losing information.
The name is always in UPPERCASE, whilst the surname is in title case.
There are a number of Stack Overflow questions, but I'm not certain how to use them with a pandas dataframe column:
- Regex to match only uppercase “words” with some exceptions
- How to extract all UPPER from a string? Python
For example:
import pandas as pd
data = {'Naam aanvrager': ['DREGGHE Joannes', 'MAHIEU Leo', 'NIEUWENHUIJSE', 'COPPENS', 'VERBURGHT Cornelis', 'NUYTTENS Adriaen', 'DE LARUELLE Pieter', 'VAN VIJVER', 'SILBO Martinus', 'STEEMAERE Anthone']}
df = pd.DataFrame(data)
Naam aanvrager
0 DREGGHE Joannes
1 MAHIEU Leo
2 NIEUWENHUIJSE
3 COPPENS
4 VERBURGHT Cornelis
5 NUYTTENS Adriaen
6 DE LARUELLE Pieter
7 VAN VIJVER
8 SILBO Martinus
9 STEEMAERE Anthone
The wanted output (two extra columns, "Name" and "Surname"):
name | surname |
---|---|
DREGGHE | Joannes |
MAHIEU | Leo |
NIEUWENHUIJSE | |
COPPENS | |
VERBURGHT | Cornelis |
NUYTTENS | Adriaen |
DE LARUELLE | Pieter |
VAN VIJVER | |
SILBO | Martinus |
STEEMAERE | Anthone |
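For reference, one possible approach is sketched below using `Series.str.extract` with named groups. It is only a sketch under some assumptions: the uppercase part contains only ASCII letters A-Z separated by single spaces (accented capitals, hyphens or apostrophes would need a wider character class), and everything after the uppercase part is treated as the surname. The variable names `pattern` and `parts` are just illustrative.

```python
import pandas as pd

data = {'Naam aanvrager': ['DREGGHE Joannes', 'MAHIEU Leo', 'NIEUWENHUIJSE', 'COPPENS',
                           'VERBURGHT Cornelis', 'NUYTTENS Adriaen', 'DE LARUELLE Pieter',
                           'VAN VIJVER', 'SILBO Martinus', 'STEEMAERE Anthone']}
df = pd.DataFrame(data)

# name:    one or more all-uppercase words; the negative lookahead (?![a-z])
#          stops the match from swallowing the capital of a title-case word.
# surname: optional; whatever follows the uppercase part after a space.
# Assumption: only ASCII A-Z appears in the uppercase part.
pattern = r'^(?P<name>[A-Z]+(?: [A-Z]+)*(?![a-z]))(?: (?P<surname>.+))?$'

# str.extract returns one column per named group; an absent surname is NaN,
# so it is replaced with an empty string to match the wanted output.
parts = df['Naam aanvrager'].str.extract(pattern).fillna('')

# Keep the original column and add the two new ones next to it.
df = df.join(parts)
print(df)
```

Because the original column is kept and the two groups together cover the whole string, no information is lost; rows that do not match the pattern at all would end up with two empty strings, which makes them easy to spot.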