How to remove square brackets from dataframe

Question

I have looked at several questions related to mine, for example:

How to remove extraneous square brackets from a nested list inside a dictionary?

but none of the solutions there worked for me.

Below is my example:

df1

column1    column2   column3    ..... upto 'n' number of columns

[data1]    data1     data1
NAN        data2     data2
[data2]    data3     [data3, data3, testing how are you guys hope you guys are doing :)]
[data3]    data3     [data4, dummy text to test to test test test] 
NAN        data4     [data5]

Here is the code I tried:

df1[column1] = df[column1].str[0]
# not working!
# I want to pass df1 itself rather than df1[columns], because there are
# a lot of columns

I want to remove only the brackets and nothing else, and I want to apply this to the whole dataframe rather than column by column, because there are a lot of columns.

Expected output:

column1    column2   column3    ..... upto 'n' number of columns

data1      data1     data1
NAN        data2     data2
data2      data3     data3, data3, testing how are you guys hope you guys are doing :)
data3      data3     data4, dummy text to test to test test test
NAN        data4     data5

`df1['column1'].str.replace(r'[\[\]]', '')` try `str.replace` — Epsi95, Aug 04 '21 at 17:37
No, I want to pass df1, not df1[columns], because there are n columns — Titan, Aug 04 '21 at 17:50
These are lists, and you searched for string related solutions. Regex only works on strings, not objects. — Wiktor Stribiżew, Aug 04 '21 at 21:20

score 1 · Accepted Answer · answered Aug 04 '21 at 17:53

Try with apply, explode and groupby:

>>> df.apply(lambda x: x.explode().astype(str).groupby(level=0).agg(", ".join))
  column1 column2                                            column3
0   data1   data1                                              data1
1     nan   data2                                              data2
2   data2   data3  data3, data3, testing how are you guys hope yo...
3   data3   data3        data4, dummy text to test to test test test
4     nan   data4                                              data5

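Use Series.explode() to transform each list element into its own row, replicating index values.
astype(str) converts every element to a string first, which is why missing values show up as the text nan in the output.
Then group by identical index values with groupby(level=0) and aggregate using str.join().
Use apply to run the same function on every column of the DataFrame.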
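For anyone who wants to try the steps above, here is a minimal self-contained sketch. The sample data is an assumption modelled on the question (shortened text, three columns), and join_lists is a hypothetical helper name. It also uses element-wise map(str) in place of the answer's astype(str); that is a substitution on my part, chosen because it simply calls Python's str on each element.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data modelled on the question: cells hold real Python
# lists, plain strings, or NaN.
df = pd.DataFrame({
    "column1": [["data1"], np.nan, ["data2"], ["data3"], np.nan],
    "column2": ["data1", "data2", "data3", "data3", "data4"],
    "column3": ["data1", "data2", ["data3", "data3", "testing"],
                ["data4", "dummy text"], ["data5"]],
})

def join_lists(col):
    # One row per list element; the original index label is repeated.
    exploded = col.explode()
    # Turn every element into text (NaN becomes the string 'nan').
    as_text = exploded.map(str)
    # Rows sharing an index label came from the same cell: join them back.
    return as_text.groupby(level=0).agg(", ".join)

# apply() runs join_lists once per column, so no column names are needed.
result = df.apply(join_lists)
print(result)
```

Note that missing values come back as the string 'nan' rather than a real NaN, just as in the output shown above.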

score 0 · Answer 2 · Simon Nasser · 2021-08-04T17:47:21.927

for i in range(len(df)):
    # .loc writes into df1 itself; chained df1['column1'][i] = ... may hit a copy
    df1.loc[i, 'column1'] = str(df['column1'][i]).strip('[]')

I haven't tested this on an example dataframe, but in my experience with pandas it should work.

Edit: the following code is tested and works:

import pandas as pd

df = pd.DataFrame({'column': ['test', '[test]']})
df1 = pd.DataFrame({'column1': ['a', 'b']})
for i in range(len(df)):
    # use .loc rather than chained indexing so the assignment reaches df1
    df1.loc[i, 'column1'] = str(df['column'][i]).strip('[]')

edited Aug 04 '21 at 17:47

answered Aug 04 '21 at 17:39


One shorter version that could be applied to the entire `DataFrame` is `df1.applymap(lambda x: x.strip('[]'))`. – luizbarcelos Aug 04 '21 at 17:46
but it throws error AttributeError: 'list' object has no attribute 'strip' while doing df1.applymap(lambda x: x.strip('[]')) – Titan Aug 04 '21 at 17:49
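The AttributeError above is consistent with the cells holding Python lists rather than strings, as pointed out in the comments under the question. A minimal sketch of an element-wise variant that checks the type before touching the value might look like the following. The sample data and the helper name unbracket are assumptions for illustration, and Series.map is applied per column here rather than applymap.

```python
import numpy as np
import pandas as pd

def unbracket(value):
    """Join list elements with ', '; return any other value unchanged."""
    if isinstance(value, list):
        return ", ".join(map(str, value))
    return value

# Hypothetical sample data modelled on the question.
df1 = pd.DataFrame({
    "column1": [["data1"], np.nan, ["data2"]],
    "column2": ["data1", "data2", "data3"],
    "column3": ["data1", "data2", ["data3", "data3", "testing"]],
})

# Apply the helper to every cell of every column, without naming columns.
cleaned = df1.apply(lambda col: col.map(unbracket))
print(cleaned)
```

Unlike str(value).strip('[]'), this leaves missing values as real NaN and does not keep the quote characters from a list's repr.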
