
I have a question about manipulating a DataFrame. In my case, the DataFrame has a column containing numbers from 1 to 1999.

I want to do the following actions:

  1. Add leading zeros to the numbers to make them 6-digit codes, for example, 000001, 000002, ... 001999
  2. Add a suffix to the 6-digit code, for example, 000001xx, 000002xx, ... 001999xx

How can I do this?

Kopi-C

3 Answers

In [93]: df = pd.DataFrame({"num":range(1, 2000)})
In [94]: df
Out[94]:
       num
0        1
1        2
2        3
3        4
4        5
...    ...
1994  1995
1995  1996
1996  1997
1997  1998
1998  1999

[1999 rows x 1 columns]
In [97]: df["new_num"] = df["num"].map("{0:0=6d}".format)
In [98]: df["new_num"] = df["new_num"] + "xx"

In [99]: df
Out[99]:
       num   new_num
0        1  000001xx
1        2  000002xx
2        3  000003xx
3        4  000004xx
4        5  000005xx
...    ...       ...
1994  1995  001995xx
1995  1996  001996xx
1996  1997  001997xx
1997  1998  001998xx
1998  1999  001999xx

[1999 rows x 2 columns]

You can combine the two steps above into one:

df["num"].map("{0:0=6d}xx".format)
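
To see why passing a bound .format method to map works, here is a minimal sketch in plain Python, without pandas. The names template and codes are illustrative and not from the answer, and the suffix is the two-character xx from the question. Series.map calls the function once per value, much as the built-in map does here.

```python
# Illustrative names (template, codes); plain Python stand-in for Series.map.
template = "{0:0=6d}xx"  # fill "0", "=" pads after any sign, width 6, "d" integer

# str.format bound to the template acts as an ordinary one-argument callable.
codes = list(map(template.format, [1, 2, 1999]))

print(codes)  # ['000001xx', '000002xx', '001999xx']
```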
bigbounty

You can create a string from each number by applying a lambda (or map, see bigbounty's answer) to compute a formatted string column:

import pandas as pd

df = pd.DataFrame({"nums": range(100, 201)})

# zero-pad to 6 digits and append the suffix in one go
df["modded"] = df["nums"].apply(lambda x: f"{x:06n}xxx")
print(df)

Output:

     nums     modded
0     100  000100xxx
1     101  000101xxx
2     102  000102xxx
..    ...        ...
98    198  000198xxx
99    199  000199xxx
100   200  000200xxx
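
The part after the colon in the f-string is a format specification. As a rough breakdown: 0 requests zero padding, 6 is the minimum field width, and n formats the value as a locale-aware number, which for integers matches d unless a locale that groups digits has been set. The sketch below uses d and illustrative values to show the pieces; it is not part of the original answer.

```python
# Illustrative values; "d" is used in place of "n" to stay locale-independent.
padded = format(100, "06d")      # zero-pad to a minimum width of 6
wide = format(1234567, "06d")    # width is a minimum, so longer numbers are not cut
with_suffix = f"{7:06d}xxx"      # literal text after the braces is appended as-is

print(padded)       # 000100
print(wide)         # 1234567
print(with_suffix)  # 000007xxx
```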
Patrick Artner
  • It is amazing. I don't understand the expression in the lambda, but it really works well. – Kopi-C Jul 26 '20 at 15:10
  • @kopi-C see [using-pythons-format-specification-mini-language-to-align-floats](https://stackoverflow.com/questions/9549084/using-pythons-format-specification-mini-language-to-align-floats) or directly here: https://docs.python.org/3/library/string.html#formatspec – Patrick Artner Jul 26 '20 at 22:01

Just use str.rjust

import pandas as pd

df = pd.DataFrame({"num": range(1, 2000)})

print(df.num.astype(str).str.rjust(6, '0') + "xx")

0       000001xx
1       000002xx
2       000003xx
3       000004xx
4       000005xx
          ...   
1994    001995xx
1995    001996xx
1996    001997xx
1997    001998xx
1998    001999xx
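
pandas also provides Series.str.zfill, which pads with leading zeros without needing a fill character. A minimal sketch, using a shortened illustrative list of values rather than the full 1 to 1999, and a variable name (codes) that is not from the answer:

```python
import pandas as pd

# Shortened illustrative data; the question's column runs from 1 to 1999.
df = pd.DataFrame({"num": [1, 2, 1999]})

# Convert to strings, left-pad with zeros to width 6, then append the suffix.
codes = df["num"].astype(str).str.zfill(6) + "xx"

print(codes.tolist())  # ['000001xx', '000002xx', '001999xx']
```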
sushanth