
This is my first time posting, so please be kind. I have searched extensively for an answer and tried every option I could find, including googling the errors to tweak the code. I have already looked at escaping the characters.

I have a pandas DataFrame with a column called Page. It contains webpage names (not URLs).

Currently they are written in 3 formats:

  1. home ? home ? pagename1
  2. home | home | pagename2
  3. home home pagename3

I would like them all to be formatted like number 3.

I am trying to remove the ? and | characters from the strings in this column but leave the rest of each string intact.

I have used this:

df.loc[df['Page'].str.replace(('\?|\|'), ''), Regex=True, Inplace=True]

but I get output:

File "<ipython-input-80-2c616b171200>", line 2
df['page']=df.loc[df['Page'].str.replace(('\?|\\'), ''), Regex=True, Inplace=True]
SyntaxError: invalid syntax

I get the same output if I use this:

df['page']=df.loc[df['Page'].str.replace(('\?|\|'), ''), Regex=True, Inplace=True]

I've resorted to trying other options such as:

x=pd.Series['Page']
x.str.replace('\?|\|','',regex = True, inplace=True)

but this gave me:

TypeError Traceback (most recent call last)
<ipython-input-70-6563d5fa5d40> in <module>
1 #clean up page names
----> 2 x=pd.Series['Page']
3 x.str.replace('\?|\|','',regex = True, inplace=True)
TypeError: 'type' object is not subscriptable

Can anyone help, please?

Thank you

Mizz

AMC
Mizz H
  • According to the documentation (https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.Series.str.replace.html), the method str.replace does not have an inplace argument. It returns a modified copy of the Series. Removing that argument solved the problem. – GusSL Sep 26 '20 at 22:12

1 Answer


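Nearly there. str.replace returns a new Series and has no inplace argument, so assign the result back to the column. The pattern and the regex flag are arguments to str.replace itself (lowercase regex), not to df.loc, and the column is named Page with a capital P. I tweaked your code a little:

df['Page'] = df['Page'].str.replace(r'\?|\|', '', regex=True)

Note that this only removes the ? and | characters; the spaces that surrounded them are left in place.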
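To end up exactly in format 3, the leftover double spaces need collapsing as well. Here is a minimal sketch of one way to do that; the sample DataFrame is made up from the three formats in the question, and the second replace step and the variable name cleaned are my own additions rather than anything in the original code.

```python
import pandas as pd

# Hypothetical sample data mirroring the three formats in the question.
df = pd.DataFrame({
    "Page": [
        "home ? home ? pagename1",
        "home | home | pagename2",
        "home home pagename3",
    ]
})

# str.replace returns a new Series, so keep the result instead of
# passing inplace=True (which str.replace does not accept).
# Inside a character class, ? and | need no escaping.
cleaned = df["Page"].str.replace(r"[?|]", "", regex=True)

# Collapse the runs of whitespace left behind and trim the ends.
cleaned = cleaned.str.replace(r"\s+", " ", regex=True).str.strip()

df["Page"] = cleaned

print(df["Page"].tolist())
# ['home home pagename1', 'home home pagename2', 'home home pagename3']
```

As for the TypeError in the second attempt: pd.Series['Page'] subscripts the Series class itself rather than your data, so select the column with df['Page'] as above.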
mullinscr