I am using Python pandas, and here is my DataFrame:

sl.no data_col1
321 abc-1
324 abc-2
326 abc-3
328 abc-4
330 abc-5
330 abc-12
331 xyz-1

I want to replace abc-<single digit> with a zero-padded value (abc-01, abc-02, abc-03, and so on). Values that do not start with abc should remain unchanged:

sl.no data_col1
321 abc-01
324 abc-02
326 abc-03
328 abc-04
330 abc-05
330 abc-12
331 xyz-1

I am new to Python and would appreciate some input on doing this with df.replace() or any other short method.

Kavya shree
  • Maybe your answer is in this question? (how to use regex and replace) [link](https://stackoverflow.com/questions/22588316/pandas-applying-regex-to-replace-values) – F.S Jul 19 '23 at 05:47

1 Answer

You can use Series.str.replace with a backreference to the capturing group and have it be preceded by 0 (see also: re.sub):

import pandas as pd

# Rebuild the sample DataFrame from the question
data = {'sl.no': {0: 321, 1: 324, 2: 326, 3: 328, 4: 330, 5: 330, 6: 331}, 
        'data_col1': {0: 'abc-1', 1: 'abc-2', 2: 'abc-3', 3: 'abc-4', 
                      4: 'abc-5', 5: 'abc-12', 6: 'xyz-1'}}
df = pd.DataFrame(data)

df['data_col1'] = df['data_col1'].str.replace(r'(?<=abc-)(\d)$',r'0\1', regex=True)

df

   sl.no data_col1
0    321    abc-01
1    324    abc-02
2    326    abc-03
3    328    abc-04
4    330    abc-05
5    330    abc-12
6    331     xyz-1

For an explanation of the patterns, see here.
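As a rough illustration of how the pattern behaves, here is a minimal sketch that applies the same regex with plain re.sub to a few sample strings (the `samples` list is made up for the demonstration). The lookbehind `(?<=abc-)` requires the text immediately before the match to be `abc-` without consuming it, `(\d)$` captures exactly one digit at the end of the string, and `\1` in the replacement puts that digit back after the added `0`:

```python
import re

# Same pattern as in the str.replace call: one trailing digit,
# matched only when it is immediately preceded by "abc-".
pattern = re.compile(r'(?<=abc-)(\d)$')

# Hypothetical sample values, mirroring the question's data
samples = ['abc-1', 'abc-12', 'xyz-1']

# r'0\1' means: a literal 0 followed by the captured digit
padded = [pattern.sub(r'0\1', s) for s in samples]
print(padded)  # ['abc-01', 'abc-12', 'xyz-1']
```

'abc-12' is left alone because the final digit is preceded by 'bc-1' rather than 'abc-', and 'xyz-1' fails the lookbehind entirely.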

Alternatively, as mentioned by @mozway in the comments, you could pass a function to repl and apply str.zfill:

df['data_col1'] = df['data_col1'].str.replace(r'(?<=abc-)(\d+)$', 
                                              lambda x: x.group().zfill(2), 
                                              regex=True)

A more labored alternative: use Series.str.split with Series.str.zfill and do something like this:

import numpy as np

tmp = df['data_col1'].str.split('-', expand=True)
# Zero-pad the numeric part only where the prefix is 'abc'
df['data_col1'] = tmp[0] + '-' + np.where(tmp[0].eq('abc'), tmp[1].str.zfill(2), tmp[1])
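Since the question mentions df.replace(), a similar result should also be reachable with Series.replace and regex=True. This is a minimal sketch, assuming the column holds plain strings; the pattern here matches the whole value rather than using a lookbehind, so the replacement has to restore the `abc-` prefix itself:

```python
import pandas as pd

# Small stand-in for the data_col1 column from the question
col = pd.Series(['abc-1', 'abc-5', 'abc-12', 'xyz-1'], name='data_col1')

# Match the full value "abc-<one digit>" and rebuild it with a leading zero
fixed = col.replace(r'^abc-(\d)$', r'abc-0\1', regex=True)
print(fixed.tolist())
```

Values that do not match the full pattern, such as 'abc-12' and 'xyz-1', are returned unchanged.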
ouroboros1
  • Both didn't work. For the first solution, instead of abc-01 the output is just 01 – Kavya shree Jul 19 '23 at 06:07
  • Updated the answer after the update of your Q. Initially, you were showing *only* `abc-` like values, that makes a difference. If it still doesn't work on your set, could you add your original `df` in the form of a dict using `df.to_dict()`? – ouroboros1 Jul 19 '23 at 06:17
  • you could also apply `zfill` in `str.replace` with a function as replacement ;) – mozway Jul 19 '23 at 06:32