I've used the os module to pull file names and created a DataFrame from the titles like this:

   Invoice              Vendor       Amount
0     2131           FileName1    68.00.pdf
1     2132           FileName2    68.00.pdf

How can I delete the .pdf from the amounts so I can find the sum of that column?

SHW

2 Answers

df['Amount'] = df['Amount'].str.rstrip('.pdf')
Tom McLean
    Just as a note, [str.rstrip](https://pandas.pydata.org/docs/reference/api/pandas.Series.str.rstrip.html) takes a "set of characters to be removed". This command removes any of the characters `.`, `p`, `d`, or `f` from the right side of each string, which is likely _not_ the desired behaviour in most circumstances. _e.g._ the result of `pd.Series(['a.pdf', 'j.xlf', 'cap', 'q.d.f.p']).str.rstrip('.pdf')` is `'a'`, `'j.xl'`, `'ca'`, `'q'`. – Henry Ecker Sep 02 '22 at 22:33
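To make the difference in that comment concrete, here is a minimal sketch that runs the same sample Series through `str.rstrip` and through `str.removesuffix`. It assumes pandas 1.4 or newer, where `Series.str.removesuffix` is available; the variable names are only illustrative.

```python
import pandas as pd

# Sample values taken from the comment above.
names = pd.Series(['a.pdf', 'j.xlf', 'cap', 'q.d.f.p'])

# rstrip treats '.pdf' as a set of characters ('.', 'p', 'd', 'f'),
# so it keeps stripping from the right while any of them match.
stripped = names.str.rstrip('.pdf')

# removesuffix only drops the exact trailing string '.pdf'
# (assumes pandas >= 1.4).
trimmed = names.str.removesuffix('.pdf')

print(stripped.tolist())  # ['a', 'j.xl', 'ca', 'q']
print(trimmed.tolist())   # ['a', 'j.xlf', 'cap', 'q.d.f.p']
```

For the `Amount` column in the question, `rstrip('.pdf')` happens to work because the character before the suffix is a digit, but `removesuffix` avoids relying on that.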
df['Amount'] = df['Amount'].str.replace(r'\.pdf$', '', regex=True)
df
   Invoice     Vendor Amount
0     2131  FileName1  68.00
1     2132  FileName2  68.00
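After the suffix is removed the column still holds strings, so it needs a numeric conversion before it can be summed as the question asks. A minimal sketch, assuming the DataFrame looks like the one in the question (it is rebuilt here by hand so the snippet runs on its own):

```python
import pandas as pd

# Hand-built copy of the DataFrame from the question (an assumption;
# the original was created from file names via the os module).
df = pd.DataFrame({
    'Invoice': [2131, 2132],
    'Vendor': ['FileName1', 'FileName2'],
    'Amount': ['68.00.pdf', '68.00.pdf'],
})

# Remove a literal trailing ".pdf": the dot is escaped and "$" anchors
# the match to the end of the string.
cleaned = df['Amount'].str.replace(r'\.pdf$', '', regex=True)

# Convert the strings to numbers; errors='coerce' turns anything that
# cannot be parsed into NaN instead of raising.
df['Amount'] = pd.to_numeric(cleaned, errors='coerce')

total = df['Amount'].sum()
print(total)  # 136.0
```

Using `errors='coerce'` is a design choice: a malformed file name becomes `NaN` and is skipped by `sum()`, so it may be worth checking `df['Amount'].isna()` before trusting the total.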
Naveed