
I have a data frame called df that looks something like this:

import pandas as pd

df = pd.DataFrame({
    'column1': ['client#1 is #name#', 'client#2 is #name#'],
    'column2': ['josh', 'max'],
})

              column1 column2
0  client#1 is #name#    josh
1  client#2 is #name#     max

I am trying to replace the phrase "#name#" in column1 with the value of column2. I want the end result to look like this:

            column1 column2
0  client#1 is josh    josh
1   client#2 is max     max

I have tried a few approaches like the following:

df['column1'] = df['column1'].replace(["#name#"], df['column2'])

But I am not sure how to grab the specific phrase '#name#' in column1 and replace it with the value of column2. Any suggestions on how to approach this would be greatly appreciated!

cs95
user3116949

1 Answer


If both columns hold strings and there are no NaNs, I would recommend calling str.replace inside a list comprehension for speed:

df['column1'] = [
    x.replace('#name#', y) for x, y in zip(df.column1, df.column2)]

df
            column1 column2
0  client#1 is josh    josh
1   client#2 is max     max
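The "no NaNs" condition matters because NaN is a float and has no .replace method, so the plain list comprehension would raise an AttributeError on such a row. One possible workaround, sketched below, is a small guard function. Note that fill_name is a hypothetical helper introduced here for illustration (it is not part of pandas), and the sample frame with missing values is made up.

```python
import numpy as np
import pandas as pd

# Made-up sample data with a missing value in each column.
df = pd.DataFrame({
    'column1': ['client#1 is #name#', np.nan, 'client#3 is #name#'],
    'column2': ['josh', 'max', np.nan],
})

def fill_name(text, name):
    # Hypothetical helper: substitute only when both values are strings;
    # otherwise return the original value unchanged.
    if isinstance(text, str) and isinstance(name, str):
        return text.replace('#name#', name)
    return text

df['column1'] = [fill_name(x, y) for x, y in zip(df.column1, df.column2)]
print(df)
```

Rows where either value is missing are left as they were, so whether that is the right behavior depends on how you want missing names to be treated.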

Why are list comprehensions worth it for string operations? You can read more in "For loops with pandas - When should I care?".
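If you want to check the speed claim on your own machine, a rough sketch like the following compares the list comprehension with a row-wise DataFrame.apply. The frame size (10,000 rows), the function names, and the choice of apply as the baseline are all assumptions made for illustration. Timings will vary with hardware and pandas version, so no expected numbers are shown.

```python
import timeit

import pandas as pd

# Hypothetical test frame: the question's two rows repeated 5,000 times.
df = pd.DataFrame({
    'column1': ['client#1 is #name#', 'client#2 is #name#'] * 5000,
    'column2': ['josh', 'max'] * 5000,
})

def with_list_comp(frame):
    # Plain Python str.replace on each pair of values.
    return [x.replace('#name#', y)
            for x, y in zip(frame.column1, frame.column2)]

def with_apply(frame):
    # Row-wise apply constructs a Series for every row before replacing.
    return frame.apply(
        lambda row: row['column1'].replace('#name#', row['column2']),
        axis=1,
    ).tolist()

# Both approaches should produce identical strings.
same_result = with_list_comp(df) == with_apply(df)

t_comp = timeit.timeit(lambda: with_list_comp(df), number=3)
t_apply = timeit.timeit(lambda: with_apply(df), number=3)
print(f'list comprehension: {t_comp:.4f}s, apply: {t_apply:.4f}s')
```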


Another interesting option you can consider is str.replace with iter:

it = iter(df.column2)
# A callable replacement requires regex=True; it is called once per match.
df['column1'] = df.column1.str.replace('#name#', lambda m: next(it), regex=True)

df
            column1 column2
0  client#1 is josh    josh
1   client#2 is max     max

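This does not raise on NaNs or non-string values in column1, but it will be slower, and it assumes every row contains exactly one '#name#', because the iterator only advances when a match is found.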
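One thing worth checking before relying on the iterator approach: the replacement callable runs once per match, not once per row. The sketch below uses a made-up frame in which the middle row has no placeholder, to show how the values can drift out of alignment compared with the zip-based list comprehension.

```python
import pandas as pd

# Made-up data: the second row contains no '#name#' placeholder.
df = pd.DataFrame({
    'column1': ['client#1 is #name#', 'no placeholder here',
                'client#3 is #name#'],
    'column2': ['josh', 'max', 'anna'],
})

it = iter(df.column2)
# next(it) is only called when '#name#' is found, so row 1 consumes nothing
# and row 2 receives 'max' instead of 'anna'.
via_iter = df.column1.str.replace(
    '#name#', lambda m: next(it), regex=True).tolist()

# zip pairs values by position, so each row always sees its own name.
via_zip = [x.replace('#name#', y) for x, y in zip(df.column1, df.column2)]

print(via_iter)  # ['client#1 is josh', 'no placeholder here', 'client#3 is max']
print(via_zip)   # ['client#1 is josh', 'no placeholder here', 'client#3 is anna']
```

When every row is known to contain exactly one placeholder, as in the question, both approaches give the same result.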


A simpler option, suggested by @Vaishali, appends column2 and then strips the placeholder. It works only if the "#name#" substring is always at the end of the string.

df['column1'] = df.column1.add(df.column2).str.replace('#name#', '')
df
            column1 column2
0  client#1 is josh    josh
1   client#2 is max     max
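To see why the append-then-strip trick depends on the placeholder position, here is a small sketch with a made-up second row where "#name#" sits in the middle of the string. The variable names are illustrative only.

```python
import pandas as pd

# Made-up data: in the second row the placeholder is not at the end.
df = pd.DataFrame({
    'column1': ['client#1 is #name#', 'hello #name#, welcome'],
    'column2': ['josh', 'max'],
})

# Append column2, then delete the placeholder (literal match, no regex).
appended = df.column1.add(df.column2).str.replace(
    '#name#', '', regex=False).tolist()

# Position-independent substitution for comparison.
zipped = [x.replace('#name#', y) for x, y in zip(df.column1, df.column2)]

print(appended)  # ['client#1 is josh', 'hello , welcomemax']
print(zipped)    # ['client#1 is josh', 'hello max, welcome']
```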
cs95
  • Thanks @coldspeed this really helps and clarifies what I am trying to do! – user3116949 Jan 11 '19 at 17:19
  • @user3116949 Word of advice, try to make sure your questions conform to the site guidelines. This means all data should be present as runnable code in your question. – cs95 Jan 11 '19 at 17:21
  • Or simply df.column1.add(df.column2).str.replace('#name#', '') :) – Vaishali Jan 11 '19 at 17:23
  • @Vaishali Ah that is a nice one and will work, assuming "#name#" is always at the end of the string. – cs95 Jan 11 '19 at 17:24