Python: Add 0/Zero in a string inside a cell

Question

I have this sample data in a cell:

EmployeeID

2016-CT-1028
2016-CT-1028
2017-CT-1063
2017-CT-1063
2015-CT-948
2015-CT-948

My problem is how to add a 0 inside a value such as 2015-CT-948 so that it becomes 2015-CT-0948. I tried this code:

pattern = re.compile(r'(\d\d+)-(\w\w)-(\d\d\d)')
newlist = list(filter(pattern.match, idList))

The idea was to get the values that match the regex pattern and then add the 0 with zfill(), but it's not working. Can someone give me an idea of how to do this? Is there any way to do it with regex or in pandas? Thank you!

Possible duplicate of [Display number with leading zeros](https://stackoverflow.com/questions/134934/display-number-with-leading-zeros) — Georgy, Oct 16 '18 at 09:13

score 4 · Accepted Answer · edited Oct 16 '18 at 06:11

One approach is to split the ID on "-" and zero-pad the last part with zfill:

import pandas as pd

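def custZfill(val):
    # Split the ID into its hyphen-separated parts
    val = val.split("-")
    # Alternative: split only on the last "-"
    # val = val.rsplit("-", 1)
    # Zero-pad the last part to four characters
    val[-1] = val[-1].zfill(4)
    return "-".join(val)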

df = pd.DataFrame({"EmployeeID": ["2016-CT-1028", "2016-CT-1028", 
                                  "2017-CT-1063", "2017-CT-1063", 
                                  "2015-CT-948", "2015-CT-948"]})
print(df["EmployeeID"].apply(custZfill))

Output:

0    2016-CT-1028
1    2016-CT-1028
2    2017-CT-1063
3    2017-CT-1063
4    2015-CT-0948
5    2015-CT-0948
Name: EmployeeID, dtype: object
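The commented-out rsplit variant matters if the prefix could itself contain extra hyphens. Below is a minimal sketch without pandas, assuming the number is always the text after the last "-". The function name pad_last_segment and its width parameter are illustrative and not part of the answer above.

```python
def pad_last_segment(value, width=4):
    """Zero-pad the text after the last '-' to `width` characters."""
    # maxsplit=1 splits only at the final hyphen, so the prefix stays intact
    parts = value.rsplit("-", 1)
    parts[-1] = parts[-1].zfill(width)
    return "-".join(parts)


ids = ["2016-CT-1028", "2015-CT-948"]
padded = [pad_last_segment(i) for i in ids]
print(padded)  # ['2016-CT-1028', '2015-CT-0948']
```

Because zfill only pads, values that already have four or more digits come back unchanged.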

What if three of your answers are correct? Do I have to choose which one to mark as the correct answer? — N.Omugs, Oct 16 '18 at 07:12

score 2 · Answer 2 · answered Oct 16 '18 at 06:05

With pandas it can be solved with split instead of regex:

df['EmployeeID'].apply(lambda x: '-'.join(x.split('-')[:-1] + [x.split('-')[-1].zfill(4)]))

Shaido


score 2 · Answer 3 · edited Oct 18 '18 at 10:58

In pandas, you could use str.replace

df['EmployeeID'] = df.EmployeeID.str.replace(r'-(\d{3})$', r'-0\1', regex=True)


Output:

0    2016-CT-1028
1    2016-CT-1028
2    2017-CT-1063
3    2017-CT-1063
4    2015-CT-0948
5    2015-CT-0948
Name: EmployeeID, dtype: object
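The pattern above matches exactly three trailing digits, so an ID ending in one or two digits would be left unchanged. A hedged sketch of a more general variant using the standard re module with a replacement function follows. The name pad_trailing_digits is made up for illustration, and it assumes the number to pad is the final run of digits in the string.

```python
import re


def pad_trailing_digits(value, width=4):
    """Zero-pad the final run of digits in `value` to `width` characters."""
    # The callable receives each match object and returns the padded digits
    return re.sub(r"\d+$", lambda m: m.group(0).zfill(width), value)


print(pad_trailing_digits("2015-CT-948"))   # 2015-CT-0948
print(pad_trailing_digits("2015-CT-48"))    # 2015-CT-0048
print(pad_trailing_digits("2016-CT-1028"))  # 2016-CT-1028
```

The same function can be applied to a column with df['EmployeeID'].apply(pad_trailing_digits).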

edited Oct 18 '18 at 10:58

answered Oct 16 '18 at 06:25

Abhi


score 1 · Answer 4 · answered Oct 16 '18 at 06:06

If the format of the IDs is strictly defined, you can also use a simple list comprehension to do this job:

ids = [
'2017-CT-1063',
'2015-CT-948',
'2015-CT-948'
]

new_ids = [i if len(i) == 12 else i[:8] + '0' + i[8:] for i in ids]
print(new_ids) 
# ['2017-CT-1063', '2015-CT-0948', '2015-CT-0948']

score 1 · Answer 5 · answered Oct 16 '18 at 06:10

Here's a one liner:

df['EmployeeID'].apply(lambda x: '-'.join(xi if i != 2 else '%04d' % int(xi) for i, xi in enumerate(x.split('-'))))
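Formatting through int() behaves differently from zfill: it rejects a suffix that is not purely numeric instead of padding it. The sketch below spells the one-liner out as a function. The name pad_with_format is hypothetical, and it assumes every ID contains at least one "-" with the number after the last one.

```python
def pad_with_format(value, width=4):
    """Zero-pad the number after the last '-' using integer formatting."""
    # rpartition splits at the last hyphen: (prefix, "-", number)
    prefix, _, number = value.rpartition("-")
    # int() raises ValueError if the suffix is not a valid integer
    return f"{prefix}-{int(number):0{width}d}"


print(pad_with_format("2015-CT-948"))  # 2015-CT-0948

try:
    pad_with_format("2015-CT-94A")
except ValueError as exc:
    print("not numeric:", exc)
```

Whether raising on malformed IDs is preferable to silently padding them depends on how clean the data is.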

Gerges

