I want to convert the numeric column month to strings that are always two characters long, padded on the left with 0. Simply prepending 0 is not enough, because 10, 11, and 12 already have two digits. What statement should I use?

import numpy as np
import pandas as pd
df=pd.DataFrame(np.arange(1,13),columns=['month'])
print(df)
    month
0       1
1       2
2       3
3       4
4       5
5       6
6       7
7       8
8       9
9      10
10     11
11     12

What I want to achieve:

    month
0      01
1      02
2      03
3      04
4      05
5      06
6      07
7      08
8      09
9      10
10     11
11     12

In the timings below, the lambda with an f-string is the fastest, though I think zfill is simple.

%timeit result = df['month'].apply(lambda x: f'{x:02}')
397 µs ± 8.08 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
%timeit df['month']=df['month'].astype(str).str.rjust(2,'0')
764 µs ± 6.35 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
%timeit df['month'] = df['month'].astype(str).str.zfill(2)
852 µs ± 10.6 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
%%timeit
m=df['month'].isin(list(range(10)))
df['month']=df['month'].astype(str)
df['month']=np.where(m,'0'+df['month'],df['month'])
1.2 ms ± 18.1 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
%%timeit
m=df['month'].isin(list(range(10)))
df.loc[m,'month']='0'+df.loc[m,'month'].astype(str)
1.65 ms ± 13.4 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
Trenton McKinney
jaried

2 Answers

3

Use Series.str.zfill. From the pandas documentation:

Pad strings in the Series/Index by prepending ‘0’ characters ... Strings in the Series/Index are padded with ‘0’ characters on the left of the string to reach a total string length width.

df['month'] = df['month'].astype(str).str.zfill(2)
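
As a self-contained sketch of this approach, the snippet below rebuilds the `month` column from the question and pads it. The variable name `padded` is only for illustration and is not part of the original answer.

```python
import numpy as np
import pandas as pd

# Same data as in the question: the integers 1..12 in a 'month' column.
df = pd.DataFrame(np.arange(1, 13), columns=['month'])

# Convert the integers to strings, then left-pad each string with '0'
# until it is at least 2 characters wide.
padded = df['month'].astype(str).str.zfill(2)

print(padded.tolist())
```

Strings that are already two or more characters long (here '10', '11', '12') are left unchanged by `zfill(2)`, which is why no separate check for single-digit months is needed.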
ifly6
1

You could specify the integer format when converting to a string. This can be done either with .format() or with an f-string. Using f-strings looks like this:

import numpy as np
import pandas as pd

data = np.arange(1,13)

df = pd.DataFrame(data, columns=['month'])
print(df)

result = df['month'].apply(lambda x: f'{x:02}')
print(result)
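
The .format() variant mentioned above might look like the sketch below. It assumes the same `month` column as in the question, and passing the bound method to `Series.map` is just one way to apply it, not necessarily the fastest.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(np.arange(1, 13), columns=['month'])

# '{:02d}'.format is a bound method of the format string, so it can be
# passed to map directly; each integer is rendered zero-padded to width 2.
result = df['month'].map('{:02d}'.format)

print(result.tolist())
```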
albert