34

I have an IPython notebook where I've accidentally dumped a huge output (15 MB) that crashed the notebook. Now when I open the notebook and attempt to delete the troublesome cell, the notebook crashes again—thus preventing me from fixing the problem and restoring the notebook to stability.

The best fix I can think of is manually pasting the input cells to a new notebook, but is there a way to just open the notebook without any outputs?

alexwlchan
wil3
  • 1
    I tried the script posted below, but it was very slow: it didn't finish within 15 minutes for either of two notebooks (one 24 MB, the other 137 MB). I then found the Python library [nbstripout](https://github.com/kynan/nbstripout), which did the job within a second. – Verena Haunschmid May 14 '19 at 13:36

7 Answers

43

You can clear the outputs from the command line:

jupyter nbconvert --ClearOutputPreprocessor.enabled=True --inplace Notebook.ipynb
Shumaila Ahmed
24

There is this nice snippet (which I use as a git commit hook) to strip the output of an IPython notebook:

#!/usr/bin/env python

def strip_output(nb):
    for ws in nb.worksheets:
        for cell in ws.cells:
            if hasattr(cell, "outputs"):
                cell.outputs = []
            if hasattr(cell, "prompt_number"):
                del cell["prompt_number"]


if __name__ == "__main__":
    from sys import stdin, stdout
    from IPython.nbformat.current import read, write

    nb = read(stdin, "ipynb")
    strip_output(nb)
    write(nb, stdout, "ipynb")
    stdout.write("\n")

You could easily make it a bit nicer to use. As written, you have to call it like this:

strip_output.py < my_notebook.ipynb > my_notebook_stripped.ipynb
filmor
  • 1
    I am using version 4.1.0 and got some API deprecation warnings when I ran this script. However, the script still works if you ignore those warnings. – Edward Fung May 03 '16 at 02:31
14

If you are running Jupyter 4.x, you will get some API deprecation warnings when running filmor's script. Although the script still works, I updated it a bit to remove the warnings.

#!/usr/bin/env python

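def strip_output(nb):
    for cell in nb.cells:
        if hasattr(cell, "outputs"):
            cell.outputs = []
        # nbformat 4 renamed "prompt_number" to "execution_count";
        # the key must stay present on code cells, so reset it to None.
        if hasattr(cell, "execution_count"):
            cell.execution_count = None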


if __name__ == "__main__":
    from sys import stdin, stdout
    from nbformat import read, write

    nb = read(stdin, 4)
    strip_output(nb)
    write(nb, stdout, 4)
    stdout.write("\n")
Community
Edward Fung
4

In later versions of Jupyter, there is a "Restart Kernel and Clear All Outputs..." option that clears the outputs, but it also discards the kernel's variables.
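If losing every output is a concern, one alternative is to clear only the oversized outputs directly in the notebook file while the notebook is closed. The following is a minimal sketch, not part of Jupyter: it assumes the nbformat 4 JSON layout (a top-level `cells` list), and the function name `clear_large_outputs` and the 1 MB default threshold are hypothetical choices.

```python
import json


def clear_large_outputs(nb, max_bytes=1_000_000):
    """Empty the outputs of cells whose serialized outputs exceed max_bytes.

    `nb` is a notebook loaded with json.load (nbformat 4 layout assumed).
    Returns the indices of the cells that were cleared.
    """
    cleared = []
    for index, cell in enumerate(nb.get("cells", [])):
        outputs = cell.get("outputs")
        if not outputs:
            continue  # markdown cells and code cells without output
        # Approximate size of this cell's outputs as stored on disk.
        size = len(json.dumps(outputs).encode("utf-8"))
        if size > max_bytes:
            cell["outputs"] = []
            cleared.append(index)
    return cleared
```

To use it, load the file with `json.load`, call the function on the resulting dict, and write the dict back with `json.dump`, ideally to a new file so the original stays available as a backup.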

[Image: Kernel Options]

tartaruga_casco_mole
  • 2
    The OP specifically asked for a solution that does not require opening the notebook. – Verena Haunschmid May 14 '19 at 11:44
  • @VerenaHaunschmid, I don't see anything in the question indicating that the OP asked "for a solution that does not require opening the notebook." The OP dumped a huge output that crashed the notebook and **can still open the notebook**, but it crashes again when **attempting to delete the troublesome cell**. 'Restart Kernel and Clear All Outputs...' clears the outputs and will prevent the notebook from crashing when he attempts to delete the offending cell. The downside of this solution is that he loses his other outputs, but keeping them was not a stated requirement, so I think this at least constitutes an answer. – tartaruga_casco_mole May 14 '19 at 13:23
  • You're right, sorry. I probably read too much of my own problem into the question, but I still think it does not really answer the question, since it says "open the notebook **without** any outputs"... – Verena Haunschmid May 15 '19 at 13:30
  • @VerenaHaunschmid, I think it does answer the question, at least in the OP's case: "Now when I open the notebook and attempt to delete the troublesome cell, the notebook crashes again". He was clearly able to open the notebook (as was I). There may be people like you who came to this question unable to open the notebook at all, but there are also people like me who only had trouble with specific cells. It might not answer your question, but it certainly helped in my case, and that is the primary reason I posted the answer: to help people who run into the same issue I did. – tartaruga_casco_mole May 15 '19 at 15:11
  • Yes, that is why I apologized. Unfortunately, Stack Overflow does not let me remove my downvote "unless the answer is edited". (I'd gladly remove it if I could, in case you want to make a minor edit to the post to allow that.) – Verena Haunschmid May 28 '19 at 05:55
  • 1
    @VerenaHaunschmid Edited. – tartaruga_casco_mole Jun 12 '19 at 17:17
4

Here is a further modification of @Edward Fung's answer that writes the cleaned notebook to a new file rather than relying on stdin and stdout:

from nbformat import read, write

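def strip_output(nb):
    for cell in nb.cells:
        if hasattr(cell, "outputs"):
            cell.outputs = []
        # nbformat 4 stores the prompt number as "execution_count"
        if hasattr(cell, "execution_count"):
            cell.execution_count = None

# Context managers make sure both files are closed (and flushed) reliably
with open("my_notebook.ipynb", encoding="utf-8") as f:
    nb = read(f, 4)
strip_output(nb)
with open("my_notebook_cleaned.ipynb", "w", encoding="utf-8") as f:
    write(nb, f, 4)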
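Since a notebook file is plain JSON, the same idea can also be sketched with only the standard library, which may help when `nbformat` is unavailable or slow on very large files. This is a minimal sketch under assumptions, not an official tool: the names `strip_outputs` and `strip_file` are hypothetical, and it assumes the usual nbformat 3 (`worksheets`) or nbformat 4 (top-level `cells`) layout without validating the result.

```python
import json


def strip_outputs(nb):
    """Remove outputs and prompt numbers from a notebook dict, in place."""
    # nbformat 4 keeps cells at the top level; nbformat 3 nests them
    # inside "worksheets".
    cells = list(nb.get("cells", []))
    for worksheet in nb.get("worksheets", []):
        cells.extend(worksheet.get("cells", []))

    for cell in cells:
        if cell.get("cell_type") != "code":
            continue
        cell["outputs"] = []
        if "execution_count" in cell:      # nbformat 4 key, must stay present
            cell["execution_count"] = None
        cell.pop("prompt_number", None)    # nbformat 3 key
    return nb


def strip_file(src, dst):
    """Read the notebook at src and write a copy without outputs to dst."""
    with open(src, encoding="utf-8") as f:
        nb = json.load(f)
    strip_outputs(nb)
    with open(dst, "w", encoding="utf-8") as f:
        json.dump(nb, f, indent=1, ensure_ascii=False)
        f.write("\n")
```

For example, `strip_file("my_notebook.ipynb", "my_notebook_cleaned.ipynb")` would write the cleaned copy next to the original, which stays untouched.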
wfgeo
3

Enabling the ClearOutputPreprocessor strips the outputs and thereby reduces the size of your notebook file:

jupyter nbconvert --ClearOutputPreprocessor.enabled=True --inplace sample.ipynb

Note that the --clear-output flag did not work for me when used as in the command below:

jupyter nbconvert --clear-output --inplace sample.ipynb

I came across this flag while looking for an answer to this question, but found that, at least in my case, it did not remove the outputs.
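One way to check whether a given command actually removed the outputs is to count the output entries left in the file afterwards. This is a minimal sketch using only the standard library; the helper name `count_outputs` is hypothetical, and it assumes the file is a regular nbformat 3 or 4 notebook.

```python
import json


def count_outputs(path):
    """Return the total number of output entries stored in a notebook file."""
    with open(path, encoding="utf-8") as f:
        nb = json.load(f)

    # nbformat 4 keeps cells at the top level; nbformat 3 nests them
    # inside "worksheets".
    cells = list(nb.get("cells", []))
    for worksheet in nb.get("worksheets", []):
        cells.extend(worksheet.get("cells", []))

    return sum(len(cell.get("outputs", [])) for cell in cells)
```

If `count_outputs("sample.ipynb")` still returns a nonzero number after running a clearing command, that command left outputs in the file.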

1

I am not able to post a comment, so feel free to edit this or move it into @Shumaila Ahmed's answer.

I had to put quotes around the file path:

jupyter nbconvert --ClearOutputPreprocessor.enabled=True --inplace 'Notebook.ipynb'

Works like a charm on Ubuntu 21.04, thanks!