2

I have a CSV file with two columns, ID and Name. For example:

ID | Name
1  | ['John Mark']

I want to remove the [''] from the name. I tried using str.strip, but it only removes the brackets.

I'm only a beginner, so sorry.

prp
  • If the `Name` column has only one value in the list, you can use `df['Name'] = df['Name'].str[0]` – anky Dec 14 '19 at 15:33
  • The Name became [ –  Dec 14 '19 at 15:39
  • Okay, this means it is not a valid list; it's just a string representation of a list. Can you try `df['Name'].apply(ast.literal_eval).str[0]`? Make sure you import the ast module first with `import ast` – anky Dec 14 '19 at 15:40
  • There is still [''] –  Dec 14 '19 at 15:43
  • Sure, I will open the question if you can create code to reconstruct the dataframe; maybe `df.head().to_dict()` would help – anky Dec 14 '19 at 15:45
  • 2
    @anky_91 posted a solution that works for the _one example_ you have in your question. For us to help, we may need some more examples, or a small line of code that can generate the dataframe for us – artemis Dec 14 '19 at 15:47
  • `aa = df.loc[df['Id'] == Id]['Name'].values` and `tt = str(Id)+"-"+aa` –  Dec 14 '19 at 15:52
  • It's for face recognition, when matching the face to the database –  Dec 14 '19 at 15:53
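
The `ast.literal_eval` suggestion from the comments can be sketched as below. This is only an illustration: the sample frame is made up to mirror the question's example, and it assumes every cell holds the string representation of a one-element list written with ordinary ASCII quotes.

```python
import ast

import pandas as pd

# Hypothetical sample mirroring the question: Name holds the *string* "['John Mark']"
df = pd.DataFrame({"ID": [1], "Name": ["['John Mark']"]})

# Parse each string into a real Python list, then take the list's first element
df["Name"] = df["Name"].apply(ast.literal_eval).str[0]

print(df["Name"].tolist())
```

If a cell is not valid Python literal syntax, `ast.literal_eval` raises an error instead of returning a list, which is one way to find out that the data is not what it looks like.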

2 Answers

3

A pandas Series supports string operations through its `.str` accessor. For example:

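# regex=False makes "['" and "']" literal text; as a regex, "['" is an unterminated character class
data_set['Name'] = data_set['Name'].str.replace("['", "", regex=False)
data_set['Name'] = data_set['Name'].str.replace("']", "", regex=False)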

I'm not sure whether this is best practice, but it should work.
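
As a self-contained sketch of the same idea: the frame below is a made-up sample mirroring the question, and `regex=False` is passed explicitly because the default of that parameter differs between pandas versions (it became `False` in pandas 2.0).

```python
import pandas as pd

# Hypothetical sample mirroring the question
data_set = pd.DataFrame({"ID": [1], "Name": ["['John Mark']"]})

# Remove the literal prefix "['" and the literal suffix "']" in one chained expression
data_set["Name"] = (
    data_set["Name"]
    .str.replace("['", "", regex=False)
    .str.replace("']", "", regex=False)
)

print(data_set["Name"].tolist())
```

Note that a literal replace removes these two-character sequences wherever they occur in the value, not only at the ends.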

nerdicsapo
1
>>> import pandas as pd
>>> data = [[1, "['John Mark']"]]
>>> df = pd.DataFrame(data, columns=["ID", "Name"])
>>> df
   ID           Name
0   1  ['John Mark']

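`str.replace` can accept a regex pattern. Pass `regex=True` explicitly, and use a raw string so the backslashes reach the regex engine unchanged.

>>> df["Name"].str.replace(r"^\['|'\]$", "", regex=True)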
0    John Mark
Name: Name, dtype: object

In case the character next to each bracket is not a single quote, match any character with `.` instead:

>>> df["Name"].str.replace(r"^\[.|.\]$", "", regex=True)
0    John Mark
Name: Name, dtype: object
vhoang
  • Tried `df["Name"].str.replace("^\['|'\]$","")`, didn't work. –  Dec 15 '19 at 05:03
  • Use the first example I provided to generate the data/dataframe that doesn't work, and update your question, as @wundermahn mentioned in the comments. I took a best guess at regenerating your data, so it is within expectations that it doesn't work. – vhoang Dec 15 '19 at 05:25
  • Ah. Based on what you've responded, the next best guess about your data is that what you think is a single quote is not an actual single quote. That lines up with the strip issue, which I would have written as `str.strip("[']")`. It would also explain the left bracket from `str[0]`, and the mismatch in eval. Try using a "." instead of a quote in the replace statement to match any character next to the bracket. – vhoang Dec 15 '19 at 05:57
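
One way to check the guess in the last comment is to print the code point of every character in a value. The sketch below is only an assumption about what the data might look like: the sample uses typographic quotes (U+2018 and U+2019) in place of ASCII apostrophes (U+0027), and the names `first` and `cleaned` are illustrative.

```python
import pandas as pd

# Hypothetical sample: the "quotes" are typographic (U+2018 / U+2019), not ASCII apostrophes
df = pd.DataFrame({"ID": [1], "Name": ["[\u2018John Mark\u2019]"]})

# List each character of the first value with its code point; an ASCII apostrophe is U+0027
first = df["Name"].iloc[0]
print([(ch, f"U+{ord(ch):04X}") for ch in first])

# "." matches whatever single character sits next to each bracket
cleaned = df["Name"].str.replace(r"^\[.|.\]$", "", regex=True)
print(cleaned.tolist())
```

If the printed code points next to the brackets are not U+0027, that would explain why patterns containing a plain `'` fail to match.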