1

I have a dataframe with a column labeled Amount which is a dollar amount. For some reason, some of the cells in this column are enclosed in quotation marks (ex: "$47.25").

I'm running this for loop and was wondering what is the best approach to remove the quotes.

for f in files:
    print(f)
    df = pd.read_csv(f, header = None, nrows=1)
    print(df)
    je = df.iloc[0,1]
    df2 = pd.read_csv(f,header = 6, dtype = {'Amount':float})
    df2.to_excel(w, sheet_name = je, index = False)

I have attempted to strip the " from the value using a for loop:

for cell in df2['Amount']:
    cell = cell.strip('"')
    df2['Amount']=pd.to_numeric(df2['Amount'])

But I am getting:

ValueError: Unable to parse string "$-167.97" at position 0

Thank you in advance!

Laurent
  • 12,287
  • 7
  • 21
  • 37

1 Answers1

0

Given this toy dataframe:

import pandas as pd

df = pd.DataFrame(
    {"Transaction": ["t1", "t2"], "Amount": ["$47.25", "'$-167.97'"]}
)
print(df)
# Outputs
  Transaction      Amount
0          t1      $47.25
1          t2  '$-167.97'

Instead of using a for loop, which should generally be avoided with dataframes, you could simply remove the quotation marks from the Amountcolumn like this:

df["Amount"] = df["Amount"].str.replace("\'", "")

print(df)
# Outputs
  Transaction    Amount
0          t1    $47.25
1          t2  $-167.97
Laurent
  • 12,287
  • 7
  • 21
  • 37