Format / Suppress Scientific Notation from Pandas Aggregation Results

Question

How can one modify the format for the output from a groupby operation in pandas that produces scientific notation for very large numbers?

I know how to do string formatting in python but I'm at a loss when it comes to applying it here.

df1.groupby('dept')['data1'].sum()

dept
value1       1.192433e+08
value2       1.293066e+08
value3       1.077142e+08

This suppresses the scientific notation if I convert to string but now I'm just wondering how to string format and add decimals.

sum_sales_dept.astype(str)

possible duplicate of [Suppressing scientific notation in pandas?](http://stackoverflow.com/questions/17737300/suppressing-scientific-notation-in-pandas) — Dan Allan, Jan 15 '14 at 13:13
I saw that question but I'm not sure how that helps me. I'm just looking to preserve the current dtype which is float and simply show all decimals in the result instead of scientific notation. — horatio1701d, Jan 15 '14 at 13:52
That is probably just a display thing. But if you think there's something particular about your problem makes yours different from the one in Dan's link then you need to post more information about your problem, preferably with a small dataset that reproduces the problem. Also what are the `dtypes` on your result? — TomAugspurger, Jan 15 '14 at 14:18
Although technically *not* a duplicate of this question, [another question's answer](https://stackoverflow.com/a/52405666/1048186) completes both the goal of this question **and adds commas between the thousands**, all with fewer characters! — Josiah Yoder, Jul 20 '23 at 17:07
While currently [Suppressing scientific notation in pandas?](https://stackoverflow.com/questions/17737300/suppressing-scientific-notation-in-pandas), is marked as a duplicate of this question, its top answer is better than this question because it **also adds the commas between the thousands**. Therefore, I support [the suggestion above that the duplicate mark be reversed](https://stackoverflow.com/questions/21137150/format-suppress-scientific-notation-from-pandas-aggregation-results#comment31811285_21137150). — Josiah Yoder, Jul 20 '23 at 17:09

score 384 · Accepted Answer · answered Jan 15 '14 at 14:40

384

Granted, the answer I linked in the comments is not very helpful. You can specify your own string converter like so.

In [25]: pd.set_option('display.float_format', lambda x: '%.3f' % x)

In [28]: Series(np.random.randn(3))*1000000000
Out[28]: 
0    -757322420.605
1   -1436160588.997
2   -1235116117.064
dtype: float64

I'm not sure if that's the preferred way to do this, but it works.

Converting numbers to strings purely for aesthetic purposes seems like a bad idea, but if you have a good reason, this is one way:

In [6]: Series(np.random.randn(3)).apply(lambda x: '%.3f' % x)
Out[6]: 
0     0.026
1    -0.482
2    -0.694
dtype: object

answered Jan 15 '14 at 14:40

Dan Allan

34,073
6
70
63

1

Thanks Dan. Do you know how to reset pandas options? – Josh Jan 23 '17 at 19:04
4

@Josh To temporarily set options in pandas, you can use `pandas.option_context` (see http://pandas.pydata.org/pandas-docs/stable/generated/pandas.option_context.html). – muellermarkus Jul 27 '18 at 10:41
2

It's oftentimes not for aesthetic purposes, but for quicker skimming of information via the visual cortex over large numeric dataframes. – matanster Apr 20 '19 at 18:57
pd.set_option('display.float_format', lambda x: '%.3f' % x) worked for me too – driven_spider May 28 '19 at 17:19
13

This works and you can also use the newer f-string notation. Like `pd.set_option('display.float_format', lambda x: f'{x:,.3f}')` if you want a thousand separator as well. – 576i Oct 29 '19 at 10:26
or: `pd.set_option('float_format', '{:.2f}'.format)` – Muhammad Yasirroni Sep 15 '22 at 04:24

score 176 · Answer 2 · edited Oct 12 '18 at 15:03

176

Here is another way of doing it, similar to Dan Allan's answer but without the lambda function:

>>> pd.options.display.float_format = '{:.2f}'.format
>>> Series(np.random.randn(3))
0    0.41
1    0.99
2    0.10

or

>>> pd.set_option('display.float_format', '{:.2f}'.format)

edited Oct 12 '18 at 15:03

Steven C. Howell

16,902
15
72
97

answered Oct 10 '17 at 17:12

tfhans

1,761
1
7
4

3

I think using a format string would be more approachable to team members that are less familiar with Python, and might not understand lambda functions. – Steven C. Howell Oct 12 '18 at 15:04

score 46 · Answer 3 · answered Jan 23 '18 at 14:00

46

You can use round function just to suppress scientific notation for specific dataframe:

df1.round(4)

or you can suppress is globally by:

pd.options.display.float_format = '{:.4f}'.format

answered Jan 23 '18 at 14:00

Vlad Bezden

83,883
25
248
179

score 30 · Answer 4 · answered Jun 18 '19 at 11:08

30

If you want to style the output of a data frame in a jupyter notebook cell, you can set the display style on a per-dataframe basis:

df = pd.DataFrame({'A': np.random.randn(4)*1e7})
df.style.format("{:.1f}")

See the documentation here.

answered Jun 18 '19 at 11:08

florestan

4,405
2
14
28

Great. How can this be applied to just a Pandas series? – Bowen Liu Dec 16 '22 at 21:58

score 22 · Answer 5 · answered Nov 19 '20 at 22:00

Setting a fixed number of decimal places globally is often a bad idea since it is unlikely that it will be an appropriate number of decimal places for all of your various data that you will display regardless of magnitude. Instead, try this which will give you scientific notation only for large and very small values (and adds a thousands separator unless you omit the ","):

pd.set_option('display.float_format', lambda x: '%,g' % x)

Or to almost completely suppress scientific notation without losing precision, try this:

pd.set_option('display.float_format', str)

Thanks! The first option fails though under certain cases. – Ethan Dec 05 '21 at 00:21 — Ethan, Dec 05 '21 at 00:21

score 7 · Answer 6 · answered Sep 07 '20 at 17:24

7

I had multiple dataframes with different floating point, so thx to Allans idea made dynamic length.

pd.set_option('display.float_format', lambda x: f'%.{len(str(x%1))-2}f' % x)

The minus of this is that if You have last 0 in float, it will cut it. So it will be not 0.000070, but 0.00007.

answered Sep 07 '20 at 17:24

Full.Of.Life

169
2
10

by far one of the most undervoted answers. – Michael Jan 24 '21 at 21:16
Could you explain a bit what are you trying to do? Because `(lambda x: f'%.{len(str(x%1))-2}f' % x)(0.000070)` returns `'0.000'` (not `'0.00007'`) and `(lambda x: f'%.{len(str(x%1))-2}f' % x)(1.000070)` returns `'1.000070000000000014'` . – Cristian Ciupitu Oct 10 '22 at 00:31
I don't know why You have results like that. It worked for me. My goal was to check the correctness of data in multiple dataframes. Most of it was some long float so it printed as scientific notation. I didn't apply it to every dataframe with some apply function. I used it as an argument in pd.set_options at the begining o file. What I did there I checked decimal places dynamically. I took the rest from substraction by whole number and changed it to string to count the lenght. Then substracted 2, because point/dot is on second position. – Full.Of.Life Oct 10 '22 at 22:17

score 7 · Answer 7 · answered Jul 11 '22 at 09:03

Expanding on this useful comment, here is a solution setting the formatting options only to display the results without changing options permanently:

with pd.option_context('display.float_format', lambda x: f'{x:,.3f}'):
    display(sum_sales_dept)

dept
value1  119,243,300.0
value2  129,306,600.0
value3  107,714,200.0

score 0 · Answer 8 · edited Dec 04 '17 at 17:40

0

If you would like to use the values, say as part of csvfile csv.writer, the numbers can be formatted before creating a list:

df['label'].apply(lambda x: '%.17f' % x).values.tolist()

edited Dec 04 '17 at 17:40

xlm

6,854
14
53
55

answered Dec 04 '17 at 17:18

evil242

31
2

Format / Suppress Scientific Notation from Pandas Aggregation Results

8 Answers8

Linked

Related