Pandas to_csv how to format both int and float variables with comma as thousands separator

Question

All the data I crunch needs to be reported with comma as thousands separator. I'm only interested in values with comma as thousands separator after the data is written to a .csv file. Internally within my pandas dataframe, I want to keep them as int or float.

floats

I can output the floats to zero decimal with:

 df.to_csv('c:\Awesome\Groovy.csv', float_format = '%.0f')

but whenever I try to put a comma in the float it doesn't work.

 df.to_csv('c:\Awesome\Groovy.csv', float_format = ':,.0f') ## WRONG
 df.to_csv('c:\Awesome\Groovy.csv', float_format = {:,.0f}'.format) ## WRONG

ints

And my plan for the ints, is first to convert them to float in the dataframe and then format them with the .to_csv function. Any thoughts?

As you've seen new-style format strings are not supported by pandas at the moment: https://github.com/pandas-dev/pandas/issues/2502 — mechanical_meat, Mar 28 '17 at 20:31
You're using the term 'comma separated values' wrong, you don't mean CSV, you mean 'comma as thousands separator'. — smci, Apr 30 '20 at 05:23

score 1 · Accepted Answer · edited May 23 '17 at 12:02

1

This might be too kludgy for you. At any rate here goes. Using this answer https://stackoverflow.com/a/4205875/42346 we can have our commas:

def thous(x, sep=',', dot='.'):
    num, _, frac = str(x).partition(dot)
    num = re.sub(r'(\d{3})(?=\d)', r'\1'+sep, num[::-1])[::-1]
    if frac:
        num += dot + frac
    return num

df['my_column'] = df['my_column'].apply(lambda x: thous(x))

Using some sample data I had lying around from another SO question:

>>> df = pd.DataFrame({'date':['2017-03-10','2017-03-11','2017-03-12'],
                       'activate_time':['2017-03-10 12:13:30','2017-03-11 13:57:49','2017-03-12 14:28:05'],
                       'mycol':[1234.56789,9876.54321,1111111.11111]})
>>> df['mycol'] = df['mycol'].apply(lambda x: thous(x))
>>> df
          activate_time        date            mycol
0   2017-03-10 12:13:30  2017-03-10      1,234.56789
1   2017-03-11 13:57:49  2017-03-11      9,876.54321
2   2017-03-12 14:28:05  2017-03-12  1,111,111.11111

edited May 23 '17 at 12:02

Community

1
1

answered Mar 28 '17 at 20:36

mechanical_meat

163,903
24
228
223

It seems to work for what I need, but I wanted it rounded so I modified : `def thous(x, sep=',', dot='.'): y = round(x) num, _, frac = str(y).partition(dot) num = re.sub(r'(\d{3})(?=\d)', r'\1'+sep, num[::-1])[::-1] # if frac: # num += dot + frac return num` – Tim Gottgetreu Mar 28 '17 at 22:11
then just looped through my dataframe: `for x in list(df): df[x] = df[x].apply(lambda x: thous(x))` So Thanks! – Tim Gottgetreu Mar 28 '17 at 22:12
Cheers, mate. Good on you for coming up with that modification. By the way, I'm jealous that you only work with CSVs! – mechanical_meat Mar 28 '17 at 22:14
1

The csv's are the end product, plenty of wacky stuff to start with. Thanks for your help! – Tim Gottgetreu Mar 28 '17 at 22:35

score 0 · Answer 2 · answered Mar 29 '17 at 22:15

This is the code I ended up with based on bernie's answer above. I wanted to post it here for other folks.

import re
def thous(x, sep=',', dot='.'):
   y = round(x)
   num, _, frac = str(y).partition(dot)
   num = re.sub(r'(\d{3})(?=\d)', r'\1'+sep, num[::-1])[::-1]
   #  comment out for no 0    if frac:
   #  comment out for no 0   num += dot + frac
   return num

num = df.select_dtypes(include=[np.number]) #selects only numeric dtypes
for x in num:
     df[x] = df[x].apply(lambda x: thous(x))

run it just before you use the to_csv command and you're golden. Cheers!

Pandas to_csv how to format both int and float variables with comma as thousands separator

floats

ints

2 Answers2