dataframe.describe() suppress scientific notation

Question

How do I suppress scientific notation in the output of dataframe.describe()?

contrib_df["AMNT"].describe()

count    1.979680e+05
mean     5.915134e+02
std      1.379618e+04
min     -1.750000e+05
25%      4.000000e+01
50%      1.000000e+02
75%      2.500000e+02
max      3.000000e+06
Name: AMNT, dtype: float64

My data is of type float64:

contrib_df["AMNT"].dtypes

dtype('float64')

So what do you want instead? `.describe` returns a `DataFrame`, so you can simply use `.drop` to remove rows you don't want. If you just want one thing like `count` you can use `.count` by itself. Or you can create your own `describe` function to only return whatever you are interested in. — Kartik, Oct 31 '16 at 18:03
http://stackoverflow.com/a/20937592/1577947 using something like `pd.options.display.float_format = '{:.2f}'.format`? — Jarad, Oct 31 '16 at 18:16
@Jarad Perfect. Please post it as an answer and I'll accept it. — mfabi, Nov 05 '16 at 13:32
@Jarad! Please post it as an answer so @mfabi can accept it, as he said before. This should be the right way to get rid of the scientific notation that pandas displays by default. Thank you!! — Elias, May 22 '21 at 11:26
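
The `float_format` display option mentioned in the comments above can be sketched as follows. This is a minimal example, not the asker's actual setup: the `amounts` Series is hypothetical sample data standing in for `contrib_df["AMNT"]`, and `pd.option_context` is used so the setting applies only inside the `with` block instead of globally.

```python
import pandas as pd

# Hypothetical sample data standing in for contrib_df["AMNT"].
amounts = pd.Series([-175000.0, 40.0, 100.0, 250.0, 3000000.0], name="AMNT")

# Render floats with two decimals, but only inside this block.
# The underlying values stay float64; only the display changes.
with pd.option_context("display.float_format", "{:.2f}".format):
    text = repr(amounts.describe())

print(text)
```

Setting `pd.options.display.float_format = '{:.2f}'.format` instead, as in the comment, would apply the same formatting for the rest of the session.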

Ash Upadhyay · Accepted Answer · 2020-11-02T17:02:27.543

123

For single column:

contrib_df["AMNT"].describe().apply(lambda x: format(x, 'f'))

For entire DataFrame (as suggested by @databyte):

df.describe().apply(lambda s: s.apply('{0:.5f}'.format))

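For whole DataFrame (as suggested by @Jayen). Note that the `g` format still switches to scientific notation for large values, so prefer `f` if the goal is to suppress it entirely: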

contrib_df.describe().apply(lambda s: s.apply(lambda x: format(x, 'g')))

`describe` returns a Series for a single column and a DataFrame for a whole frame, so the code above simply formats each value in regular fixed-point notation. I wrote this answer because I kept thinking: **it's pointless to show a count of 95 as 9.500000e+01.** Values are also easier to compare in regular notation.

Before applying the above function we were getting

count    9.500000e+01
mean     5.621943e+05
std      2.716369e+06
min      4.770000e+02
25%      2.118160e+05
50%      2.599960e+05
75%      3.121170e+05
max      2.670423e+07
Name: salary, dtype: float64

After applying, we get

count          95.000000
mean       562194.294737
std       2716369.154553
min           477.000000
25%        211816.000000
50%        259996.000000
75%        312117.000000
max      26704229.000000
Name: salary, dtype: object
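
One consequence worth spelling out: the `dtype: object` line shows that the formatted result holds strings, not floats. A minimal sketch of keeping a numeric copy alongside the display copy, assuming a hypothetical `salary` Series (the values are illustrative, not the data behind the output above):

```python
import pandas as pd

# Hypothetical sample data; not the original salary column.
salary = pd.Series([477.0, 211816.0, 259996.0, 312117.0, 26704229.0], name="salary")

summary = salary.describe()                           # numeric summary, float64
formatted = summary.apply(lambda x: format(x, "f"))   # same summary as strings

print(formatted)

# Arithmetic should use the numeric summary, not the formatted strings.
value_range = summary["max"] - summary["min"]
print(value_range)
```

Keeping `summary` around means later calculations still work, while `formatted` is used only for printing.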

edited Nov 02 '20 at 17:02

answered Nov 09 '17 at 16:44

Ash Upadhyay



for anyone trying to do this on a dataframe and not a series, it's: `contrib_df.describe().apply(lambda s: s.apply(lambda x: format(x, 'g')))` – Jayen Aug 10 '19 at 03:33
@Jayen - any idea how to round it up / down? – SCool Aug 13 '19 at 17:54
@SCool - I believe `x` is a normal python float so you should be able to use `format(math.ceil(x), 'g')` – Jayen Aug 14 '19 at 01:17
@Jayen won't using `g` defeat the point? From the docs: "This rounds the number to p significant digits and then **formats the result in either fixed-point format or in scientific notation, depending on its magnitude.**" – Joseph Garvin Nov 14 '19 at 05:30
This didn't work for me, got `unsupported format string passed to Series.__format__` – Joseph Garvin Nov 14 '19 at 05:31
@JosephGarvin yes, sorry i can't edit it now. i wanted `g` for my case but the answer to the question should be `f`. i was just trying to show how to do it to a dataframe. – Jayen Nov 14 '19 at 09:04
Without having to nest lambdas: `df.describe().apply(lambda s: s.apply('{0:.5f}'.format))` – databyte Oct 30 '20 at 12:52
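
The `g` versus `f` point raised in the comments can be checked with a short sketch. The DataFrame here is hypothetical sample data; the two formatting expressions are the ones quoted in the comments, with `.2f` swapped in for `.5f` purely for brevity.

```python
import pandas as pd

# Hypothetical sample data with one large-magnitude column.
df = pd.DataFrame({
    "salary": [477.0, 211816.0, 26704229.0],
    "bonus": [1.5, 20.0, 300.25],
})

stats = df.describe()

# Fixed-point formatting: never falls back to scientific notation.
fixed = stats.apply(lambda s: s.apply("{:.2f}".format))

# General formatting: keeps 6 significant digits and uses an exponent
# once the value is too large to show that way.
general = stats.apply(lambda s: s.apply(lambda x: format(x, "g")))

print(fixed.loc["max", "salary"])    # 26704229.00
print(general.loc["max", "salary"])  # 2.67042e+07
```

So `g` is handy for compact output, but `f` (optionally with a precision such as `.0f` or `.2f`) is the one that answers the question as asked.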
