
I have the following DataFrame, working in a Jupyter notebook on Windows 10:

import pandas as pd

d = {'col1': [1000000, 200000000000000], 'col2': [3, 4]}
df2 = pd.DataFrame(data=d)
df2.describe()

which leads to the following output:

         col1           col2
count   2.000000e+00    2.000000
mean    1.000000e+14    3.500000
std     1.414214e+14    0.707107
min     1.000000e+06    3.000000
25%     5.000000e+13    3.250000
50%     1.000000e+14    3.500000
75%     1.500000e+14    3.750000
max     2.000000e+14    4.000000

I know it is only a cosmetic issue, but I dislike that the first row (count) is displayed in scientific notation like the rest of the column; it makes the values harder to compare. It seems pandas picks one format per column rather than per row, right? Is there a way to avoid this and get the following output instead:

         col1           col2
count   2.000000        2.000000
mean    1.000000e+14    3.500000
std     1.414214e+14    0.707107
min     1.000000e+06    3.000000
25%     5.000000e+13    3.250000
50%     1.000000e+14    3.500000
75%     1.500000e+14    3.750000
max     2.000000e+14    4.000000

Or is it maybe possible to have the same format per row instead of column?

Can I only change the float display globally, as in "Customized float formatting in a pandas DataFrame", or can pandas do this in a smart way?

PV8
  • Possible duplicate of [Customized float formatting in a pandas DataFrame](https://stackoverflow.com/questions/42735541/customized-float-formatting-in-a-pandas-dataframe) ... another: [Pandas data frame. Change float format. Keep type “float”](https://stackoverflow.com/questions/54378389/pandas-data-frame-change-float-format-keep-type-float) – wwii Aug 08 '19 at 13:28
  • I think you are looking for this, maybe? https://stackoverflow.com/questions/55394854/how-to-change-the-format-of-describe-output – Rnne Msly Aug 08 '19 at 13:34
  • What about `df2.describe().astype(str)`? If it's only for display, then who cares that it's a string? – ALollz Aug 08 '19 at 13:38
  • The approach is not bad. Can I somehow round the numbers to two decimal places first? – PV8 Aug 08 '19 at 13:41
  • `df2.describe().astype(object)` maybe? – 0asa Aug 08 '19 at 14:08
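
As a rough sketch of the string-conversion idea from the comments, the rounding asked about above can be applied before the conversion. This assumes the result is only used for display; the variable name `rounded_text` is made up for this illustration.

```python
import pandas as pd

df2 = pd.DataFrame({'col1': [1000000, 200000000000000], 'col2': [3, 4]})

# Round the summary statistics to two decimal places first, then convert
# every cell to a string so pandas no longer picks one float format per column.
rounded_text = df2.describe().round(2).astype(str)
print(rounded_text)
```

The cells are strings afterwards, so this is only suitable for viewing, not for further arithmetic.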

2 Answers

pd.options.display.float_format = lambda x: f'{x:<12.6e}' if x < .1 or x >= 10 else f'{x:<12.6f}'

          col1         col2
count 2.000000     2.000000    
mean  1.000000e+14 3.500000    
std   1.414214e+14 0.707107    
min   1.000000e+06 3.000000    
25%   5.000000e+13 3.250000    
50%   1.000000e+14 3.500000    
75%   1.500000e+14 3.750000    
max   2.000000e+14 4.000000
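
The one-liner above can also be written as a named function, which makes the thresholds easier to see. The sketch below is one possible reading of it, not the answerer's exact code: the function name `mixed_float_format` is hypothetical, the `<12` padding is dropped, and `abs()` plus a special case for zero are added as assumptions so that values are classified by magnitude.

```python
import pandas as pd


def mixed_float_format(x: float) -> str:
    """Use fixed-point for moderate magnitudes and scientific notation otherwise."""
    # The thresholds 0.1 and 10 come from the one-liner above. Using abs() and
    # treating 0 as fixed-point are assumptions added for this sketch.
    if x == 0 or 0.1 <= abs(x) < 10:
        return f"{x:.6f}"
    return f"{x:.6e}"


# This is a global display option: it affects every float pandas prints.
pd.options.display.float_format = mixed_float_format

df2 = pd.DataFrame({'col1': [1000000, 200000000000000], 'col2': [3, 4]})
summary_text = df2.describe().to_string()
print(summary_text)
```

Because the option is global, `pd.reset_option('display.float_format')` restores the default afterwards.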
Stef

Try something like:

import pandas as pd
pd.options.display.float_format = '{:,.2e}'.format
d = {'col1': [1000000, 200000000000000], 'col2': [3, 4]}
df2 = pd.DataFrame(data=d)
df2.describe()

This should give:

          col1     col2
count 2.00e+00 2.00e+00
mean  1.00e+14 3.50e+00
std   1.41e+14 7.07e-01
min   1.00e+06 3.00e+00
25%   5.00e+13 3.25e+00
50%   1.00e+14 3.50e+00
75%   1.50e+14 3.75e+00
max   2.00e+14 4.00e+00

Note that pandas lets you customize several aspects of its behavior, including display-related options such as `float_format`. Have a look at the available options in the pandas documentation.

0asa
  • This is not exactly what I am aiming for. Can I avoid the +00 in the first row? – PV8 Aug 08 '19 at 13:36
  • The `to_string` method (https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.to_string.html) lets you customize formats, but unfortunately they only apply per column. – 0asa Aug 08 '19 at 14:14
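
Since the `to_string` formatters work per column, per-row formatting needs a different route. One possible workaround, sketched here under the assumption that the result is only needed for display, is to convert each row to strings with its own format spec. The names `row_specs` and `default_spec` are hypothetical and chosen only for this illustration.

```python
import pandas as pd

df2 = pd.DataFrame({'col1': [1000000, 200000000000000], 'col2': [3, 4]})
summary = df2.describe()

# Hypothetical per-row format specs; rows not listed use default_spec.
row_specs = {'count': '{:.0f}'}
default_spec = '{:.2e}'

# apply(axis=1) passes each row as a Series whose .name is the row label,
# so the format spec can be looked up per row and applied to every cell.
formatted = summary.apply(
    lambda row: row.map(row_specs.get(row.name, default_spec).format),
    axis=1,
)
print(formatted)
```

The resulting frame holds strings, so the count row shows `2` while the other rows keep scientific notation.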