
I have the following dataset in a DataFrame and need to remove the square brackets from the data. How can I do this?

   From             TO
   [wrestle]        engage in a wrestling match
   [write]          communicate or express by writing
   [write]          publish
   [spell]          write
   [compose]        write music

Expected output is:

   From             TO
   wrestle          engage in a wrestling match
   write            communicate or express by writing
   write            publish
   spell            write
user3483203
Cyley Simon

2 Answers


Suppose you have this dataframe:

df = pd.DataFrame({'Region':['New York','Los Angeles','Chicago'], 'State': ['NY [new york]', '[California]', 'IL']})

Which will be like this:

        Region          State
0     New York  NY [new york]
1  Los Angeles   [California]
2      Chicago             IL

To just remove the square brackets you need the following lines:

df['State'] = df['State'].str.replace(r"\[", "", regex=True)
df['State'] = df['State'].str.replace(r"\]", "", regex=True)

The result:

        Region        State
0     New York  NY new york
1  Los Angeles   California
2      Chicago           IL
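
As a minimal sketch, the two replacements above can also be done in one pass with a character class that matches either bracket. `regex=True` is passed explicitly here because the default of that parameter is not the same in every pandas version; the DataFrame is the same one used in this answer.

```python
import pandas as pd

df = pd.DataFrame({
    'Region': ['New York', 'Los Angeles', 'Chicago'],
    'State': ['NY [new york]', '[California]', 'IL'],
})

# [\[\]] is a character class matching a single "[" or "]".
# regex=True makes the pattern a regular expression regardless of the
# pandas version's default.
df['State'] = df['State'].str.replace(r"[\[\]]", "", regex=True)

print(df['State'].tolist())
# ['NY new york', 'California', 'IL']
```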

If you want to remove the square brackets along with everything between them:

df['State'] = df['State'].str.replace(r" \[.*\]", "", regex=True)
df['State'] = df['State'].str.replace(r"\[.*\]", "", regex=True)

The first line removes a bracketed group together with the space before it; the second removes a bracketed group that has no leading space. Running both, in this order, avoids leaving a stray trailing space behind.

Applying these two lines to the original df gives:

        Region State
0     New York    NY
1  Los Angeles      
2      Chicago    IL
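
The two patterns above can probably be folded into a single one. The sketch below is one way to do it, not the only one: `\s*` absorbs any whitespace before the bracket, and `[^\]]*` stops at the first closing bracket, so two separate bracketed groups in one cell are removed individually instead of everything between the first `[` and the last `]`. The fourth row is an assumed extra example that is not in the original data.

```python
import pandas as pd

df = pd.DataFrame({
    'State': ['NY [new york]', '[California]', 'IL', 'NY [a] city [b]'],
})

# \s*      optional whitespace before the opening bracket
# \[       literal "["
# [^\]]*   any characters except "]" (keeps each match to one group)
# \]       literal "]"
pattern = r"\s*\[[^\]]*\]"

# Remove the bracketed groups, then trim whitespace left at either end.
df['State'] = df['State'].str.replace(pattern, "", regex=True).str.strip()

print(df['State'].tolist())
# ['NY', '', 'IL', 'NY city']
```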
user_5

Use str.strip if the values are strings:

print (type(df.loc[0, 'From']))
<class 'str'>

df['From'] = df['From'].str.strip('[]')

If the values are lists, convert them to strings with str.join:

print (type(df.loc[0, 'From']))
<class 'list'>

df['From'] = df['From'].str.join(', ')

Thanks to @juanpa.arrivillaga for the suggestion: if every list contains exactly one item, select that item directly:

df['From'] = df['From'].str[0]

You can check whether that is the case with:

print (type(df.loc[0, 'From']))
<class 'list'>

print (df['From'].str.len().eq(1).all())
True

print (df)
      From                                 TO
0  wrestle        engage in a wrestling match
1    write  communicate or express by writing
2    write                            publish
3    spell                              write
4  compose                        write music
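
If the column might hold a mix of both cases, the checks above can be combined in a small helper. This is only a sketch: `unbracket` is a hypothetical name, and the sample DataFrame deliberately mixes lists and bracketed strings for illustration, which is an assumption rather than the asker's actual data.

```python
import pandas as pd

def unbracket(value):
    """Return a plain string for a list of strings or a bracketed string."""
    if isinstance(value, list):
        # Lists: join the items, as str.join does for a whole column.
        return ', '.join(map(str, value))
    if isinstance(value, str):
        # Strings: drop leading/trailing "[" and "]" characters.
        return value.strip('[]')
    # Anything else (e.g. NaN) is returned unchanged.
    return value

df = pd.DataFrame({
    'From': [['wrestle'], '[write]', ['spell', 'cast']],
    'TO': ['engage in a wrestling match', 'publish', 'write'],
})

df['From'] = df['From'].map(unbracket)

print(df['From'].tolist())
# ['wrestle', 'write', 'spell, cast']
```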
jezrael