
I am trying to replace the [] values with '', strip the spaces, and return a clean DataFrame.

                   Distance             TGR Grade                 TGR1
0   [342m, 342m, 530m, 342m]         [M, M, RW, RW]            [1, 1, 7, 1]
1   [390m, 390m, 390m, 390m,450]    [M, 7, 6G, X45, X67]       [1, 2, 4, 5, 5]

I have applied several functions, but the result is either NaN values or the same DataFrame, unchanged.

To strip the brackets from both sides of the values, I tried:

df[df.columns]=df[df.columns].apply(lambda x:x.str.strip())
df[cols]=df[cols].astype(str).agg(lambda x:x.str.strip("frozenset({''})"),1)
df.replace('\[', '', regex=True)
df.replace('\]', '', regex=True)

But the df still remains the same.

chuky pedro
  • Is that exactly how you're calling `replace`? Because you [need to assign back](https://stackoverflow.com/a/37593583/15497888) – Henry Ecker Oct 02 '21 at 03:19
  • Are you sure the cells actually contain '[ ]'? Or are those lists, and pandas is merely PRINTING them with '[ ]'? – Tim Roberts Oct 02 '21 at 03:28
  • if you try using string operations on LISTS, the result is NaN. – cs95 Oct 02 '21 at 04:04
  • Your question is to remove all square brackets and spaces in the column names and data ? I've posted an answer below. Kindly accept the answer by checking it if it answers your question. – EBDS Oct 02 '21 at 04:33
  • @TimRoberts they are all list, but `pd.info()` return objects for the values – chuky pedro Oct 02 '21 at 07:05
  • Right. That's the key point people seem to be missing. THESE AREN'T STRINGS. String operations aren't going to work. His cells do not contain brackets or commas. His cells contain Python lists. You need to tell us what you want the output to look like -- something you haven't yet done. – Tim Roberts Oct 03 '21 at 00:32
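The point made in the comments above can be checked directly. This is a minimal sketch, assuming the cells really are Python lists as the asker confirmed; the sample frame is rebuilt by hand from the printed table, so the exact values are an assumption, not the asker's real data.

```python
import pandas as pd

# Assumed reconstruction of the question's data: every cell is a Python list.
df = pd.DataFrame({
    "Distance": [["342m", "342m", "530m", "342m"],
                 ["390m", "390m", "390m", "390m", "450"]],
    "TGR1": [[1, 1, 7, 1], [1, 2, 4, 5, 5]],
})

# The column dtype is `object`, which says nothing about what the cells hold,
# so inspect the Python type of each cell instead.
cell_types = df["Distance"].map(type).unique()
print(cell_types)

# String methods do not apply to list cells: pandas returns NaN for each of them.
stripped = df["Distance"].str.strip()
print(stripped)
```

If `cell_types` shows `list` rather than `str`, the brackets are only part of how pandas prints the lists, and there is nothing for `str.strip` or a regex `replace` to remove.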

1 Answer


Does this answer your question?

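# Remove whitespace from the column names
df.columns = [''.join(i.split()) for i in df.columns]
# Strip the brackets and all whitespace (assumes every cell is a str); returns a new DataFrame
df.applymap(lambda x: ''.join(x.strip('[]').split()))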

Output:

    Distance                   TGRGrade         TGR1
0   342m,342m,530m,342m        M,M,RW,RW        1,1,7,1
1   390m,390m,390m,390m,450    M,7,6G,X45,X67   1,2,4,5,5

Updated with full code:

from io import StringIO
import pandas as pd

from IPython.core.interactiveshell import InteractiveShell
InteractiveShell.ast_node_interactivity='all'

d = '''
"Distance","TGR Grade","TGR1"
"[342m, 342m, 530m, 342m]","[M, M, RW, RW]","[1, 1, 7, 1]"
"[390m, 390m, 390m, 390m,450]","[M, 7, 6G, X45, X67]","[1, 2, 4, 5, 5]"
'''
df = pd.read_csv(StringIO(d))
df
df.columns = [''.join(i.split()) for i in df.columns]
df = df.applymap(lambda x: ''.join(x.strip('[]').split()))
df

Output:

    Distance                     TGR Grade            TGR1
0   [342m, 342m, 530m, 342m]     [M, M, RW, RW]       [1, 1, 7, 1]
1   [390m, 390m, 390m, 390m,450] [M, 7, 6G, X45, X67] [1, 2, 4, 5, 5]

    Distance                     TGRGrade             TGR1
0   342m,342m,530m,342m          M,M,RW,RW            1,1,7,1
1   390m,390m,390m,390m,450      M,7,6G,X45,X67       1,2,4,5,5
EBDS
  • I am getting the below error `AttributeError: 'list' object has no attribute 'strip'` – chuky pedro Oct 02 '21 at 06:42
  • @ChukypedroOkolie I've updated with full codes and full output. Can you compare with your code ? If it still happen, can you tell which line is giving the problem ? – EBDS Oct 02 '21 at 07:42
  • I think the values are in a column of list string. I need to remove the values from the list. I am still having the same error – chuky pedro Oct 02 '21 at 08:00
  • is the above output what you wanted ? cause that's what I understood from your question. If the output is not what you wanted, then have to show what output you want. – EBDS Oct 02 '21 at 08:12
  • He has not told us what he wanted. The key point you missed is that his cells do not contain strings. String operations aren't going to work. His cells contain Python lists. – Tim Roberts Oct 03 '21 at 00:33
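Since the desired output was never stated, the following is only a sketch of one possible reading: if the goal is the comma-joined text shown in the answer's output, the list cells can be joined rather than stripped. The helper name `join_cell` and the sample data are hypothetical and introduced here for illustration.

```python
import pandas as pd

def join_cell(value, sep=","):
    """Hypothetical helper: join a list cell into one string, pass other values through."""
    if isinstance(value, (list, tuple)):
        # Convert each item to str first, because some lists hold integers.
        return sep.join(str(item).strip() for item in value)
    return value

# Assumed reconstruction of the question's data, with list cells.
df = pd.DataFrame({
    "Distance": [["342m", "342m", "530m", "342m"],
                 ["390m", "390m", "390m", "390m", "450"]],
    "TGR Grade": [["M", "M", "RW", "RW"], ["M", "7", "6G", "X45", "X67"]],
    "TGR1": [[1, 1, 7, 1], [1, 2, 4, 5, 5]],
})

# Apply the helper to every cell, column by column, and assign the result.
clean = df.apply(lambda col: col.map(join_cell))

# Remove spaces from the column names as well.
clean.columns = clean.columns.str.replace(" ", "", regex=False)
print(clean)
```

If the intent is instead one value per row, `Series.explode` may be a better fit than joining, but that depends on the output the asker actually wants.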