
I have a very odd situation: the following code only throws an error if nothing follows the last command in a Jupyter notebook cell:

import pandas as pd
df = pd.DataFrame({"letters": ["a", "b"]})
df.style.format("{:.2f}")

If any command such as a=1 or print("hello world") follows, no error is printed. No error is raised in a normal Python script either, even without a following line of code. I want it to give an error, since I cannot format strings like that. Why isn't it giving me one? I expect to get an error if something doesn't work.

Tested with Python 3.8 and 3.9, with Jupyter in VS Code and in Firefox.

Jonathanthesailor

1 Answer


I think the short answer is that you don't get the error until something tries to render df.style.format("{:.2f}") as HTML. Only when the Styler object is displayed does it try to format the letters as floats, and that is when the error occurs. If you want the error to show when df.style.format("{:.2f}") is in the middle of a code cell, wrap it in display(). (These days in Jupyter you don't need from IPython.display import display, but you may elsewhere. You can import it in a Python script too, if IPython is installed. Because you ultimately need to be working in Jupyter to use the Pandas Styler object, I think you'd have Jupyter around.)


What you are seeing is specific to how rendering of the Pandas Styler object works in conjunction with the last line of a cell (or, as we'll see, with IPython's display()). You can get a clue to this by examining the traceback that accompanies the `ValueError: Unknown format code 'f' for object of type 'str'` error. In particular, this part:

File /srv/conda/envs/notebook/lib/python3.10/site-packages/pandas/io/formats/style.py:379, in Styler._repr_html_(self)
    374 """
    375 Hooks into Jupyter notebook rich display system, which calls _repr_html_ by
    376 default if an object is returned at the end of a cell.
    377 """
    378 if get_option("styler.render.repr") == "html":
--> 379     return self.to_html()
    380 return None

It seems _repr_html_ gets triggered when the Styler object is at the end of a cell. That relates to this quote from the 'Styler Object and HTML' section of the Pandas documentation:

"The DataFrame.style attribute is a property that returns a Styler object. It has a _repr_html_ method defined on it so they are rendered automatically in Jupyter Notebook."

Together, these mean that when the Styler is on the last line of a cell, Jupyter's display() is triggered on it. When it is in the middle of a cell, that doesn't happen. Writing df.style.format("{:.2f}") in the middle of the code just creates the Styler object and records the settings it will use. No display() is triggered, so there is no error-causing action even though the df.style.format("{:.2f}") line does run. Only when the Python kernel tries to render that object in the notebook does it hit the problem that causes the error to be shown.
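To illustrate the "record now, fail at render time" behaviour without involving Pandas at all, here is a minimal sketch. LazyStyler is a hypothetical stand-in written for this answer, not the real Styler class; it only mimics the idea that format() stores the spec and that _repr_html_ is where the spec actually gets applied.

```python
class LazyStyler:
    """Hypothetical stand-in for a Styler: stores a format spec, applies it only when rendered."""

    def __init__(self, values):
        self.values = values
        self.spec = "{}"

    def format(self, spec):
        # Only records the spec; nothing is formatted here, so nothing can fail yet.
        self.spec = spec
        return self

    def _repr_html_(self):
        # Rich display front ends call this hook; the spec is applied only now.
        cells = "".join(f"<td>{self.spec.format(v)}</td>" for v in self.values)
        return f"<table><tr>{cells}</tr></table>"


# Building the object and setting the format runs without any error.
styler = LazyStyler(["a", "b"]).format("{:.2f}")

render_error = None
try:
    styler._repr_html_()  # roughly what happens when the object is displayed
except ValueError as err:
    render_error = str(err)
    print("render failed:", render_error)
```

The same spec works once the values really are floats, which suggests the failure comes from the data at render time and not from the call to format().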

Can we test that by getting the error even when df.style.format("{:.2f}") is called in the middle of a cell?

You can make the error show up when it is internal by adding code to trigger trying to display the styler object. Run this:

import pandas as pd
df = pd.DataFrame({"letters": ["a", "b"]})
display(df.style.format("{:.2f}"))
print("hello world")

You'll see now you get the error. The only change was wrapping df.style.format("{:.2f}") in display().

So when df.style.format("{:.2f}") is the expression to display, either because it is on the last line of a cell or because you call display() on it, the Python kernel sees that it should try to represent it as HTML. It takes the Styler object and calls its _repr_html_ method, which applies the settings given by "{:.2f}" to the dataframe. When it tries to apply those settings to a dataframe that contains only strings, it cannot, and you see ValueError: Unknown format code 'f' for object of type 'str'. If you look above that error in the traceback, you'll see the internal code that was trying to do the rendering.



I ruled out the InteractiveShell.ast_node_interactivity setting as the cause, as follows:

It's not simply the interactivity that the last line of a cell has that is being triggered.
If it were, then changing the setting so that every expression is displayed as if it were on the last line should also produce the error. You make that change by first running:

from IPython.core.interactiveshell import InteractiveShell
InteractiveShell.ast_node_interactivity = "all"

(That change is based on here. )

After setting InteractiveShell.ast_node_interactivity = "all", running the following would then be expected to give the error too, but running the code block below after the one above doesn't result in the error:

import pandas as pd
df = pd.DataFrame({"letters": ["a", "b"]})
df.style.format("{:.2f}")
print("hello world")
Wayne