1

My goal is to state a list of columns by name that I want to apply the formatting to.

The commented-out section is what I am ideally after (it would replace the line of code directly above it), but it raises the following error:

'DataFrame' object has no attribute 'map'

Is there a way/better way to achieve what I am attempting?

import pandas as pd

cols = ['Spend', 'Sales']
stuff = [[3, 2], [5, 6]]
df = pd.DataFrame(stuff, columns=cols)

df.loc[:, 'Spend'] = '$' + df['Spend'].map('{:,.0f}'.format)

# list1 = ['Spend', 'Sales']
# df.loc[:, list1] = '$' + df[list1].map('{:,.0f}'.format)

print(df)
ah2Bwise

3 Answers

4

Your code almost gets there; you just need to call map inside a call to apply, which runs it once per column:

list1 = ['Spend', 'Sales']
df.loc[:, list1] = '$' + df[list1].apply(lambda col: col.map('{:,.0f}'.format))

Output:

>>> df
  Spend Sales
0    $3    $2
1    $5    $6
  • Just curious, is there a way to do this with vectorization? – ah2Bwise Mar 25 '22 at 21:56
  • No. However, if you're thinking that `apply` is slow (as they always say), you're right that it is, **but**, since it's being used for each *column* rather than each *row*, it's not really a concern here. Using `apply` for each column is generally considered fine. –  Mar 25 '22 at 21:59
  • @anarchocaps `apply` on `axis=0` is actually very fast. For a DataFrame of shape `(200000, 2)`, it's the fastest option given here; it's ~7% faster than @bertwassink's answer and ~15% faster than my answer –  Mar 25 '22 at 22:08
  • Or use a list comprehension. There is no vectorization per se for strings (maybe with arrow strings), so a list comprehension should be fast enough – sammywemmy Mar 26 '22 at 00:27
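The list comprehension suggested in the last comment could look roughly like the sketch below. It reuses the question's sample data and `list1`; looping over the columns and assigning with `df[col] = ...` is just one way to wire it up, not something taken from the thread.

```python
import pandas as pd

df = pd.DataFrame([[3, 2], [5, 6]], columns=['Spend', 'Sales'])
list1 = ['Spend', 'Sales']

# Format each listed column with a plain Python list comprehension
# and replace the column with the resulting strings.
for col in list1:
    df[col] = ['${:,.0f}'.format(value) for value in df[col]]

print(df)
```

Each column is replaced wholesale, so the original numeric values are lost; keep a copy if they are still needed for calculations.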
2

Are you looking for something along the lines of:

df[list1] = ['$']*2 + df[list1].astype(str)

or (as suggested by @richardec):

df[list1] = '$' + df[list1].astype(str)

Output:

  Spend Sales
0    $3    $2
1    $5    $6
  • `['$']*2` is not *necessary*; you can just use `'$'` –  Mar 25 '22 at 21:51
  • This is nice, but is there also a way to make it work with something like df.loc[:, list1] = df[list1].map('{:,.0f}'.format)? The formatting I am doing does not always simply add a character on the front or back. ty – ah2Bwise Mar 25 '22 at 21:55
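For formatting that is more than a fixed prefix or suffix, one possible approach is to keep a format string per column and apply it with `Series.map`. This is only a sketch: the `formats` dict and the `' units'` suffix are made up for illustration and do not come from the question.

```python
import pandas as pd

df = pd.DataFrame([[3, 2], [5, 6]], columns=['Spend', 'Sales'])

# Hypothetical per-column format strings (not from the original post)
formats = {'Spend': '${:,.0f}', 'Sales': '{:,.1f} units'}

# Series.map accepts any callable, so the bound str.format method works directly
for col, fmt in formats.items():
    df[col] = df[col].map(fmt.format)

print(df)
```

Columns that are not listed in `formats` are left untouched.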
2

From the thread "How to display pandas DataFrame of floats using a format string for columns?" we can learn:

import pandas as pd

cols = ['Spend', 'Sales']
stuff = [[3, 2], [5, 6]]
df = pd.DataFrame(stuff, columns=cols)
list1 = ['Spend', 'Sales']
df[list1] = df[list1].applymap('${:,.2f}'.format)

Output:

df
   Spend  Sales
0  $3.00  $2.00
1  $5.00  $6.00
bert wassink
  • Using `applymap` here is cool! However, it's not as vectorized as using `apply` column-wise + `map`, as in my answer. +1 soon anyway –  Mar 25 '22 at 21:59
  • It is indeed not optimal, but you might want to include the dollar sign as a function argument if you want to pipe several functions. – bert wassink Mar 25 '22 at 22:12
  • @richardec, _'it's not as vectorized as would be using apply'_, maybe it's not as vectorized, but it's definitely faster: 2.07 ms ± 52.3 µs against 2.6 ms ± 59.3 µs. Anyway, [@enke](https://stackoverflow.com/a/71623482/18344512)'s solution is even faster: 1.11 ms ± 19.3 µs – SergFSM Mar 26 '22 at 06:11
  • @SergFSM interesting! Wouldn't have guessed that. –  Mar 26 '22 at 14:42
  • It is probably because it is a small example. I guess there will be a turning point after which the vectorized solution performs better. This is usually the case for vectorized solutions. – bert wassink Mar 26 '22 at 16:21
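A note for readers on newer pandas: as far as I know, `DataFrame.map` was added in pandas 2.1 and `applymap` was deprecated in its favor, so the error from the question only appears on older versions. The sketch below tries to work on both; the `elementwise` name is just a local helper for this example.

```python
import pandas as pd

df = pd.DataFrame([[3, 2], [5, 6]], columns=['Spend', 'Sales'])
list1 = ['Spend', 'Sales']

subset = df[list1]

# Prefer DataFrame.map where it exists (pandas >= 2.1, to my knowledge);
# otherwise fall back to the older applymap.
elementwise = getattr(subset, 'map', None) or subset.applymap

df[list1] = elementwise('${:,.2f}'.format)

print(df)
```

If only recent pandas versions matter, `df[list1].map('${:,.2f}'.format)` alone should be enough.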