1

I am trying to replace a value inside a string column. The value sits between two specific markers in the string.

For example, I want to change this dataframe

 df

 seller_name    url
 Lucas          http://sanyo.mapi/s3/e42390aac371?item_title=Branded%20boys%20Clothing&seller_name=102392852&buyer_item=106822419_1056424990 

to this:

url
http://sanyo.mapi/s3/e42390aac371?item_title=Branded%20boys%20Clothing&seller_name=Lucas&buyer_item=106822419_1056424990 

Note that in the URL, the numbers after seller_name= have been replaced by the real name from the seller_name column.

I imagine something like replacing everything from seller_name= up to the first & with the value from the seller_name column.

This is just an example of what I want to do. In reality my dataframe has many rows, and the length of the number after seller_name= is not always the same.

galfisher
Lucas Dresl
  • Can't you split the string by `&` into a list, then take list element where the `seller_name` is to replace the seller code by seller name and concatenate the list back into a string? You could also substitute using regular expressions https://stackoverflow.com/questions/11475885/python-replace-regex – Nerdrigo Sep 25 '18 at 18:34

4 Answers

1

Use apply with re.sub to replace the numeric code in each URL with that row's seller name.

Sample df

import pandas as pd
df = pd.DataFrame({
    'seller_name': ['Lucas'],
    'url': ['http://sanyo.mapi/s3/e42390aac371?item_title=Branded%20boys%20Clothing&seller_name=102392852&buyer_item=106822419_1056424990'],
})

import re
def myfunc(row):
    # Raw string so \d reaches the regex engine unchanged; \d+ matches a code of any length
    return re.sub(r'seller_name=\d+', 'seller_name=' + row.seller_name, row.url)

# axis=1 passes each row to myfunc
df['url'] = df.apply(myfunc, axis=1)
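One caveat with building the replacement string by concatenation is that re.sub interprets backslashes in it, so a seller name containing a backslash could be mangled or raise an error. A minimal sketch of a more defensive helper follows, assuming the code is always digits and the parameter is preceded by ? or &. The name replace_seller_code is hypothetical, not part of the answer above.

```python
import re

# Match only the digits that follow "?seller_name=" or "&seller_name=".
SELLER_CODE = re.compile(r'(?<=[?&]seller_name=)\d+')

def replace_seller_code(url, name):
    # A function replacement is inserted literally, so backslashes
    # in `name` are not treated as regex escapes.
    return SELLER_CODE.sub(lambda match: name, url)

url = ('http://sanyo.mapi/s3/e42390aac371?item_title=Branded%20boys%20Clothing'
       '&seller_name=102392852&buyer_item=106822419_1056424990')
print(replace_seller_code(url, 'Lucas'))
```

With the sample df, this could be applied row by row in the same way as myfunc, for example through df.apply with axis=1.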
mad_
0
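seller_name = 'Lucas'
url = 'http://sanyo.mapi/s3/e42390aac371?item_title=Branded%20boys%20Clothing&seller_name=102392852&buyer_item=106822419_1056424990'
# The value starts right after 'seller_name=' and ends at the next '&'
a = url.index('seller_name=') + len('seller_name=')
b = url.index('&', a)
# Slice instead of str.replace, so the same digits elsewhere in the URL are left alone
out = url[:a] + seller_name + url[b:]
print(out)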

Try this one.

Rohit-Pandey
0

This solution doesn't assume the order of your query parameters or the length of the ID you're replacing. All it assumes is that your query is &-delimited and that the seller_name parameter is present (and is not the first parameter after the ?, since that chunk still carries the base URL).

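url = 'http://sanyo.mapi/s3/e42390aac371?item_title=Branded%20boys%20Clothing&seller_name=102392852&buyer_item=106822419_1056424990'
split_by_amps = url.split('&')
for i in range(len(split_by_amps)):
    if split_by_amps[i].startswith('seller_name='):
        # Overwrite the whole parameter (assigning, not appending with +=)
        split_by_amps[i] = 'seller_name=' + 'Lucas'
        break

result = '&'.join(split_by_amps)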
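If the parameter might appear first in the query string, one alternative is to let the standard library parse the query instead of splitting on & by hand. The following is only a sketch: set_query_param is a hypothetical helper name, and it assumes that re-encoding the query with %20 for spaces is acceptable, since parsing and re-encoding can normalise other escapes in the URL.

```python
from urllib.parse import urlsplit, urlunsplit, parse_qsl, urlencode, quote

def set_query_param(url, key, value):
    parts = urlsplit(url)
    # parse_qsl keeps parameter order and decodes percent-escapes.
    pairs = parse_qsl(parts.query, keep_blank_values=True)
    # Swap in the new value wherever the key matches, whatever its position.
    pairs = [(k, value if k == key else v) for k, v in pairs]
    # quote_via=quote encodes spaces as %20 rather than '+'.
    query = urlencode(pairs, quote_via=quote)
    return urlunsplit(parts._replace(query=query))

url = ('http://sanyo.mapi/s3/e42390aac371?item_title=Branded%20boys%20Clothing'
       '&seller_name=102392852&buyer_item=106822419_1056424990')
result = set_query_param(url, 'seller_name', 'Lucas')
print(result)
```

The trade-off is that the URL is rebuilt rather than edited in place, so it suits cases where exact byte-for-byte preservation of the rest of the URL is not required.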
Woody1193
0

You can use regular expressions to substitute the code for the name:

import pandas as pd
import re

# For example, use a dictionary to map codes to names.
# The keys are strings because the code extracted from the URL is a string.
seller_dic = {'102392852': 'Lucas'}

for i in df.index:
    # Be very careful with this: if a URL doesn't have this structure,
    # re.search returns None and .group() raises an AttributeError,
    # so you may want to handle exceptions
    code = re.search(r'seller_name=(\d+)&', df.loc[i, 'url']).group(1)

    name = 'seller_name=' + seller_dic[code] + '&'

    url = re.sub(r'seller_name=\d+&', name, df.loc[i, 'url'])

    # Assign through .loc to avoid chained-assignment problems
    df.loc[i, 'url'] = url
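The exception handling mentioned in the comments can be avoided if unknown codes and URLs without the parameter are simply left unchanged. A rough sketch of that idea follows, assuming string keys in the mapping and that "leave as-is" is the desired fallback. The names seller_names and code_to_name are hypothetical.

```python
import re

# Hypothetical mapping from seller code (as a string) to seller name.
seller_names = {'102392852': 'Lucas'}

SELLER_CODE = re.compile(r'(?<=[?&]seller_name=)\d+')

def code_to_name(url, names):
    # Look each matched code up in the mapping; fall back to the code
    # itself when it is unknown, so the URL is returned unchanged.
    return SELLER_CODE.sub(lambda m: names.get(m.group(0), m.group(0)), url)

known = 'http://sanyo.mapi/s3/e42390aac371?seller_name=102392852&buyer_item=1'
unknown = 'http://sanyo.mapi/s3/e42390aac371?seller_name=555&buyer_item=1'
print(code_to_name(known, seller_names))
print(code_to_name(unknown, seller_names))
```

Applied to a whole column, this could be written as df['url'].apply(code_to_name, names=seller_names), which also avoids the explicit loop over row indices.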
Nerdrigo