71

I have a DataFrame in pandas where some of the numbers are expressed in scientific notation (or exponent notation) like this:

                  id        value
id              1.00    -4.22e-01
value          -0.42     1.00e+00
percent        -0.72     1.00e-01
played          0.03    -4.35e-02
money          -0.22     3.37e-01
other            NaN          NaN
sy             -0.03     2.19e-04
sz             -0.33     3.83e-01

And the scientific notation makes what should be an easy comparison needlessly difficult. I assume it's the 2.19e-04 value that's screwing it up for the others. I mean, 1.0 is encoded as 1.00e+00. ONE!

This doesn't work:

np.set_printoptions(supress=True) 

And pandas.set_printoptions doesn't implement suppress either. I've looked all through pd.describe_option() in despair, and pd.core.format.set_eng_float_format() only seems to turn it on for all the other float values, with no ability to turn it off.

zhangyangyu
user1244215
  • 2
    Did you fix the typo in `np.set_printoptions(suppress=True)` - two p's in suppress? – smci Nov 16 '16 at 14:50
  • I believe this question should be reopened because it has the best answer and was asked earlier than the one it was closed as a duplicate of. – Josiah Yoder Jul 20 '23 at 17:12
  • 1
    @JosiahYoder, just because it is closed doesn't mean it is deleted. It just stops more answers. – Rohit Gupta Jul 29 '23 at 14:22

6 Answers

102

quick temporary: df.round(4)

global: pd.options.display.float_format = '{:20,.2f}'.format

The :20 means the total width should be twenty characters, padded with whitespace on the left if it would otherwise be shorter. You can simply use '{:,.2f}' if you don't want to specify the width. The comma adds a thousands separator.

The .2f means that there should be two digits after the decimal point, even if they are zeros.
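As a rough sketch of the two options above (the sample values are made up for illustration): `df.round(4)` returns a rounded copy, while `display.float_format` only changes how floats are printed. `pd.option_context` is used here so the setting does not leak into the rest of the session; assigning to `pd.options.display.float_format` makes it global.

```python
import pandas as pd

# Hypothetical sample data mixing small and large magnitudes.
df = pd.DataFrame({"value": [-4.22e-01, 1.0, 2.19e-04, 1234567.891]})

# Quick and temporary: a rounded copy; df itself is unchanged.
rounded = df.round(4)

# Display-only: floats are rendered with two decimals and a thousands separator.
with pd.option_context("display.float_format", "{:,.2f}".format):
    formatted = df.to_string()

print(rounded)
print(formatted)
```

The underlying values in `df` keep full precision in both cases; only `rounded` holds altered numbers.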

Josiah Yoder
citynorman
  • 3
    try this experiment: `print('{:20,.8f}'.format(12333344445676.0123456789))`, then adjust the 20 to 40 and see what happens and I think you'll have your answer. you can use this same style formatter on numbers in a print statement. – TMWP Jul 26 '17 at 04:12
  • Fwiw you might also want to [suppress the data type output](https://stackoverflow.com/questions/24295451/suppress-descriptive-output-when-printing-pandas-dataframe) – citynorman Oct 06 '17 at 20:58
  • 1
    This is the only really working solution for me. Works like a charm in Jupyter. – Bouncner Feb 15 '18 at 08:09
  • 2
    Agree with @Bouncner, I also tried many solutions but found that only this solution can print a specific number of decimal points for float values in pandas as expected. – Good Will Apr 22 '18 at 21:00
  • 1
    Great! For scientific notation, use `'{:e}'.format` – Eduardo Reis May 10 '21 at 18:30
15

Your data is probably object dtype. This is a direct copy/paste of your data. read_csv interprets it as the correct dtype. You should normally only have object dtype on string-like fields.

In [5]: df = read_csv(StringIO(data), sep=r'\s+')

In [6]: df
Out[6]: 
           id     value
id       1.00 -0.422000
value   -0.42  1.000000
percent -0.72  0.100000
played   0.03 -0.043500
money   -0.22  0.337000
other     NaN       NaN
sy      -0.03  0.000219
sz      -0.33  0.383000

Check whether your dtypes are object:

In [7]: df.dtypes
Out[7]: 
id       float64
value    float64
dtype: object

This converts this frame to object dtype (notice the printing is funny now)

In [8]: df.astype(object)
Out[8]: 
           id     value
id          1    -0.422
value   -0.42         1
percent -0.72       0.1
played   0.03   -0.0435
money   -0.22     0.337
other     NaN       NaN
sy      -0.03  0.000219
sz      -0.33     0.383

This is how to convert it back (astype(float) also works here). Note that convert_objects() was removed in pandas 1.0; infer_objects() is its replacement.

In [9]: df.astype(object).convert_objects()
Out[9]: 
           id     value
id       1.00 -0.422000
value   -0.42  1.000000
percent -0.72  0.100000
played   0.03 -0.043500
money   -0.22  0.337000
other     NaN       NaN
sy      -0.03  0.000219
sz      -0.33  0.383000

This is what an object dtype frame would look like

In [10]: df.astype(object).dtypes
Out[10]: 
id       object
value    object
dtype: object
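For readers on a recent pandas, the same round trip can be sketched as follows. This assumes `infer_objects()` as the stand-in for the removed `convert_objects()`, and uses a shortened copy of the question's data.

```python
from io import StringIO

import pandas as pd

# Shortened copy of the question's table; the header has one field fewer
# than the rows, so the first column becomes the index.
data = """id value
id 1.00 -4.22e-01
value -0.42 1.00e+00
other NaN NaN
sy -0.03 2.19e-04
"""

df = pd.read_csv(StringIO(data), sep=r"\s+")

as_object = df.astype(object)          # every column becomes object dtype
restored = as_object.infer_objects()   # soft conversion back to float64

print(df.dtypes)
print(as_object.dtypes)
print(restored.dtypes)
```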
Jeff
  • Actually the column was int64, that had then been df.corr() 'd which returns all float64s – user1244215 Jul 23 '13 at 03:36
  • 1
    if you have NaN in the column it could NOT have been int64; only float64 or object – Jeff Jul 23 '13 at 08:39
  • df.corr() returns NaNs when the stddev of a column is 0. – user1244215 Jul 23 '13 at 21:47
  • they may have started out as ``Int64`` but they are ``float64`` by definition. However, if they were actually object to begin with, then they still might be ``object`` – Jeff Jul 23 '13 at 21:56
7

Try this which will give you scientific notation only for large and very small values (and adds a thousands separator unless you omit the ","):

pd.set_option('display.float_format', '{:,g}'.format)

Or to almost completely suppress scientific notation without losing precision, try this:

pd.set_option('display.float_format', str)
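A small sketch of how these two formatters behave, with made-up sample values. It uses `str.format` with the `g` presentation type, since the `,` thousands flag is only available there and not in `%`-style formatting. `pd.option_context` keeps the change local to each block.

```python
import pandas as pd

# Hypothetical values: one large-ish, one small, one very small.
s = pd.Series([1234.5, 0.000219, 2.19e-05])

# 'g' switches to scientific notation only for very small or very large values.
with pd.option_context("display.float_format", "{:,g}".format):
    as_g = s.to_string()

# str() keeps full precision but still falls back to exponent form
# for values below 1e-4.
with pd.option_context("display.float_format", str):
    as_str = s.to_string()

print(as_g)
print(as_str)
```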
dabru
3

Quick fix that leaves the underlying data unrounded (the display itself shows no decimal places):

pd.options.display.float_format = '{:.0f}'.format
Reihan_amn
2

If you would like to use the values as formatted strings in a list, say as part of a CSV file written with csv.writer, the numbers can be formatted before creating the list:

df['label'].apply(lambda x: '%.17f' % x).values.tolist()
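A minimal sketch of that idea end to end. The column name `label` comes from the line above; the sample values and the in-memory buffer are assumptions for illustration.

```python
import csv
from io import StringIO

import pandas as pd

# Hypothetical frame with a float column named 'label'.
df = pd.DataFrame({"label": [2.19e-04, 1.0, -4.35e-02]})

# Format each float as a fixed-point string before building the list.
formatted = df["label"].apply(lambda x: "%.17f" % x).tolist()

# Write the strings as one CSV row; StringIO stands in for a real file.
buffer = StringIO()
csv.writer(buffer).writerow(formatted)
line = buffer.getvalue()

print(line)
```

Because the values are already strings, csv.writer emits them unchanged, so no exponent notation reaches the file.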
evil242
0

I tried all the options like

  1. pd.options.display.float_format = '{:.4f}'.format
  2. pd.set_option('display.float_format', str)
  3. pd.set_option('display.float_format', lambda x: f'%.{len(str(x%1))-2}f' % x)
  4. pd.set_option('display.float_format', lambda x: '%.3f' % x)

but nothing worked for me.

So when assigning the value (var1) to a variable (say num1), I used round(var1, 5).

num1 = round(var1,5)

This is a crude method, as you have to call round in each assignment, but it gives you control over how the rounding happens so you get what you want.