Replace text string in entire column after first occurrence

Question

I'm trying to replace all but the first occurrence of a text string in an entire column. My specific case is replacing underscores with periods in data that looks like client_19_Aug_21_22_2022, which I need to become client_19.Aug.21.22.2022.

If I use [1], I get this error: string index out of range.
[:1] replaces all occurrences (it doesn't skip the first one).
[1:] inserts . after every character but doesn't find and replace _.

df1['Client'] = df1['Client'].str.replace('_'[:1],'.')
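
For context, the three slices appear to act on the one-character pattern string '_' itself, before str.replace ever sees it, which would account for all three results. A minimal sketch in plain Python (no pandas; the sample strings are only illustrative):

```python
pattern = "_"

# [1]: a one-character string has no index 1.
try:
    pattern[1]
except IndexError as exc:
    print(exc)  # string index out of range

# [:1]: the slice is still "_", so every underscore is replaced.
print("a_b_c".replace(pattern[:1], "."))  # a.b.c

# [1:]: the slice is the empty string, which matches between characters.
print(repr(pattern[1:]))                  # ''
print("a_b".replace(pattern[1:], "."))    # .a._.b.
```

None of the slices tells replace which occurrence to skip; they only change what is being searched for.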

It's better not to mark text as code and it's better to provide a `df` example to simplify debugging. See https://stackoverflow.com/questions/20109391/how-to-make-good-reproducible-pandas-examples — Александр М, Aug 29 '22 at 19:20

score 0 · Answer 1 · answered Aug 29 '22 at 19:33

Not the simplest solution, but it works:

import re
# Series has no .str.apply; call .apply on the column itself
df1['Client'] = df1['Client'].apply(lambda s: re.sub(r'^(.*?)\.', r'\1_', s.replace('_', '.')))

In the lambda function, we first replace every _ with a period, then turn the first period back into _. Finally, apply runs the lambda on each value in the column. Note that this assumes the text before the first underscore contains no period of its own.
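
The same replace-everything-then-restore-the-first idea can also be sketched without a regular expression, using the optional count argument of str.replace. The function name swap_all_but_first is hypothetical, and the sketch assumes the text before the first underscore contains no period:

```python
def swap_all_but_first(s):
    # Hypothetical helper, not part of the answer above.
    # Step 1: turn every underscore into a period.
    dotted = s.replace("_", ".")
    # Step 2: turn only the first period back into an underscore (count=1).
    # Assumes there was no period before the first underscore.
    return dotted.replace(".", "_", 1)


print(swap_all_but_first("client_19_Aug_21_22_2022"))  # client_19.Aug.21.22.2022
print(swap_all_but_first("client123"))                 # client123
```

It could then be applied to the column with df1['Client'].map(swap_all_but_first).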

score 0 · Answer 2 · answered Aug 29 '22 at 19:35

Pandas Series have a .map method that you can use to apply an arbitrary function to every row in the Series.

In your case you can write your own replace_underscores_except_first function, looking something like:

def replace_underscores_except_first(s):
    # Split at the first underscore; sep is '' if there is none
    head, sep, tail = s.partition('_')
    # Keep the first underscore, replace the remaining ones
    return head + sep + tail.replace('_', '.')

and then pass that to .map like:

df1['Client'] = df1['Client'].map(replace_underscores_except_first)

score 0 · Answer 3 · answered Aug 29 '22 at 21:17

An example using map: the function checks whether the string contains an underscore. If it does, it splits the string on underscores, keeps the first underscore, and joins the remaining parts with dots.

import pandas as pd

items = [
    "client_19_Aug_21_22_2022",
    "client123"
]


def replace_underscore_with_dot_except_first(s):
    if "_" in s:
        parts = s.split("_")
        return f"{parts[0]}_{'.'.join(parts[1:])}"
    return s


df1 = pd.DataFrame(items, columns=["Client"])

df1['Client'] = df1['Client'].map(replace_underscore_with_dot_except_first)
print(df1)

Output

                     Client
0  client_19.Aug.21.22.2022
1                 client123
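
As a possible alternative to map, Series.str.replace accepts a callable replacement when regex=True, so the same split-at-the-first-underscore logic can be sketched in a single call. The pattern and the helper name dots_after_first are assumptions for illustration:

```python
import pandas as pd

df1 = pd.DataFrame({"Client": ["client_19_Aug_21_22_2022", "client123"]})


def dots_after_first(match):
    # group(1): everything up to and including the first underscore
    # group(2): the rest, where the remaining underscores become periods
    return match.group(1) + match.group(2).replace("_", ".")


# Values without an underscore do not match and are left unchanged.
df1["Client"] = df1["Client"].str.replace(
    r"^([^_]*_)(.*)$", dots_after_first, regex=True
)
print(df1)
```

Whether this is preferable to map is mostly a matter of taste; both call a Python function once per value.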
