Formatting Multiple Columns by Name using .loc

Question

My goal is to state a list of columns by name that I want to apply the formatting to.

The commented-out section is what I am ideally after (it would replace the line of code directly above it), but it raises the following error:

'DataFrame' object has no attribute 'map'

Is there a way/better way to achieve what I am attempting?

import pandas as pd

cols = ['Spend', 'Sales']
stuff=[[3, 2],[5, 6]]
df = pd.DataFrame(stuff, columns = cols)


df.loc[:, 'Spend'] ='$'+ df['Spend'].map('{:,.0f}'.format)

# list1=['Spend', 'Sales']
# df.loc[:, list1] ='$'+ df[list1].map('{:,.0f}'.format)

print(df)

score 4 · Accepted Answer · answered Mar 25 '22 at 21:49

Your code almost gets there, you just need to call map inside a call to apply (which will do it for each column):

list1 = ['Spend', 'Sales']
df.loc[:, list1] = '$'+ df[list1].apply(lambda col: col.map('{:,.0f}'.format))

Output:

>>> df
  Spend Sales
0    $3    $2
1    $5    $6
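A self-contained sketch of the same idea, for cases where the formatting does more than prepend a character. The helper name `format_columns` is hypothetical, not part of pandas. The sketch uses plain column assignment on a copy rather than `.loc`, on the assumption that replacing the columns outright is the simpler way to swap integer columns for strings:

```python
import pandas as pd


def format_columns(df, cols, fmt):
    """Return a copy of df with the named columns rendered through fmt.

    format_columns is a hypothetical helper, used here for illustration only.
    """
    out = df.copy()
    # apply hands each selected column (a Series) to the lambda;
    # Series.map then formats every value in that column.
    out[cols] = out[cols].apply(lambda col: col.map(fmt.format))
    return out


df = pd.DataFrame([[3, 2], [5, 6]], columns=['Spend', 'Sales'])
formatted = format_columns(df, ['Spend', 'Sales'], '${:,.0f}')
print(formatted)
```

Any format string accepted by `str.format` can be passed as `fmt`, so the same call covers suffixes, percentages, or fixed decimals.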

Just curious, is there a method to do this with vectorization? – ah2Bwise Mar 25 '22 at 21:56
No. However, if you're thinking that `apply` is slow (as they always say), you're right that it is, **but**, since it's being used for each *column* rather than each *row*, it's not really a concern here. Using `apply` for each column is generally considered fine. – Mar 25 '22 at 21:59
@anarchocaps `apply` on `axis=0` is actually very fast. For a DataFrame of shape `(200000, 2)`, it's the fastest option given here; it's ~7% faster than @bertwassink's answer and ~15% faster than my answer – Mar 25 '22 at 22:08
Or use a list comprehension. There is no vectorization per se for strings (maybe with arrow strings), so list comprehension should be fast enough – sammywemmy Mar 26 '22 at 00:27
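The list-comprehension route mentioned in this comment could look roughly like the sketch below. It reuses the question's sample data. The loop over `list1` is one assumed way to spell it out, not code from the commenter:

```python
import pandas as pd

df = pd.DataFrame([[3, 2], [5, 6]], columns=['Spend', 'Sales'])
list1 = ['Spend', 'Sales']

for col in list1:
    # Build the formatted strings in plain Python, then replace the column.
    df[col] = ['${:,.0f}'.format(value) for value in df[col]]

print(df)
```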

score 2 · Answer 2 · 2022-03-25T21:53:24.613

Are you looking for something along the lines of:

df[list1] =['$']*2 + df[list1].astype(str)

or (as suggested by @richardec):

df[list1] = '$' + df[list1].astype(str)

Output:

  Spend Sales
0    $3    $2
1    $5    $6

edited Mar 25 '22 at 21:53

answered Mar 25 '22 at 21:50

`['$']*2` is not *necessary*, you can just use `'$'` – Mar 25 '22 at 21:51
This is nice, but is there also a way to make it work with something like `df.loc[:, list1] = df[list1].map('{:,.0f}'.format)`? The formatting I am doing does not always simply add a character to the front or back. ty – ah2Bwise Mar 25 '22 at 21:55

bert wassink · Answer 3 · 2022-03-25T21:59:58.747

From the thread "How to display pandas DataFrame of floats using a format string for columns?" we can learn:

cols = ['Spend', 'Sales']
stuff=[[3, 2],[5, 6]]
df = pd.DataFrame(stuff, columns = cols)
list1 = ['Spend', 'Sales']
df[list1] = df[list1].applymap('${:,.2f}'.format)

output

df
   Spend  Sales
0  $3.00  $2.00
1  $5.00  $6.00
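A note for newer pandas versions: from pandas 2.1 onward, `DataFrame.map` exists as the element-wise method and `applymap` is deprecated in its favour, so the call the question originally attempted no longer fails with the attribute error there. A minimal sketch that picks whichever method the installed version provides (the `hasattr` check is an assumption about how one might support both, not something from this answer):

```python
import pandas as pd

df = pd.DataFrame([[3, 2], [5, 6]], columns=['Spend', 'Sales'])
list1 = ['Spend', 'Sales']

subset = df[list1]
# DataFrame.map is available in pandas >= 2.1; older versions only have applymap.
elementwise = subset.map if hasattr(subset, 'map') else subset.applymap
df[list1] = elementwise('${:,.2f}'.format)

print(df)
```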

edited Mar 25 '22 at 21:59

answered Mar 25 '22 at 21:57

Using `applymap` here is cool! However, it's not as vectorized as using `apply` column-wise + `map`, as in my answer. +1 soon anyway – Mar 25 '22 at 21:59
It is indeed not optimal, but you might want to include the dollar sign as a function argument if you want to pipe several functions. – bert wassink Mar 25 '22 at 22:12
@richardec, _'it's not as vectorized as would be using apply'_, maybe it's not as vectorized, but it is definitely faster: 2.07 ms ± 52.3 µs against 2.6 ms ± 59.3 µs. Anyway, [@enke](https://stackoverflow.com/a/71623482/18344512)'s solution is even faster: 1.11 ms ± 19.3 µs – SergFSM Mar 26 '22 at 06:11
@SergFSM interesting! Wouldn't have guessed that. – Mar 26 '22 at 14:42
It is probably because it is a small example. I guess there will be a turning point after which the vectorized solution performs better. This is usually the case for vectorized solutions. – bert wassink Mar 26 '22 at 16:21
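Since the timings quoted in these comments disagree depending on the size of the data, one way to check on your own machine is a small `timeit` harness like the sketch below. The function names, the random test data, and the two sizes (the question's 2 rows and the 200,000 rows mentioned in an earlier comment) are all assumptions made for illustration. Values are kept below 1000 so that all three approaches produce the same strings:

```python
import timeit

import numpy as np
import pandas as pd

cols = ['Spend', 'Sales']
fmt = '${:,.0f}'.format


def with_apply_map(frame):
    # Column-wise apply, then Series.map on each column.
    return frame[cols].apply(lambda col: col.map(fmt))


def with_elementwise(frame):
    # DataFrame.map in pandas >= 2.1, applymap in older versions.
    subset = frame[cols]
    elementwise = subset.map if hasattr(subset, 'map') else subset.applymap
    return elementwise(fmt)


def with_astype(frame):
    # String conversion plus a prefix; no thousands separator here.
    return '$' + frame[cols].astype(str)


rng = np.random.default_rng(0)
timings = {}
for n_rows in (2, 200_000):
    frame = pd.DataFrame(rng.integers(0, 1000, size=(n_rows, 2)), columns=cols)
    for func in (with_apply_map, with_elementwise, with_astype):
        # Best of three single runs, in seconds.
        seconds = min(timeit.repeat(lambda: func(frame), number=1, repeat=3))
        timings[(func.__name__, n_rows)] = seconds
        print(f'{func.__name__:<18} rows={n_rows:>7}  {seconds:.4f}s')
```

The absolute numbers will vary with the pandas version and hardware, so treat the printed values as a local comparison only.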
