20

I have an IPython notebook that is several megabytes in size, although the code inside is only about 100 lines. I think it is that large because I load several images inside.

I would like to add this notebook to a git repository. However, I don't want to upload something that big which can easily be generated again.

Is it possible to save just the code of an IPython notebook to reduce its size?

Thomas K
Martin Thoma
  • http://stackoverflow.com/questions/18734739/using-ipython-notebooks-under-version-control may be related. See the section about stripping the output. – cel Jun 14 '16 at 09:01
  • Another experimental tool that might help: [recombinecm](https://github.com/takluyver/recombinecm). It saves the notebook as two files, and the idea is that you put the clean code-only file in version control, and not the file with all the outputs. – Thomas K Jun 14 '16 at 17:14

3 Answers

32

You can try the following steps, which worked for me:

In the menu bar, select "Cell" -> "All Output" -> "Clear".

And then save the file.

This will reduce the size of your file (from MBs to kBs). It will also reduce the time it takes to load the notebook the next time you open it in your browser.

As far as I understand, this clears all the output created when the code was executed. A notebook file stores not only the code, images, and comments but also the output of every cell, and it is this stored output that increases the size of the notebook.
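The same clearing can be scripted, which is handy before a commit. The following is only a minimal sketch, assuming a notebook in the nbformat 4 JSON layout (a top-level `cells` list, with saved widget state under the notebook-level `metadata["widgets"]` key). The function name `strip_outputs` and its parameters are made up for illustration and are not part of Jupyter.

```python
import json
from pathlib import Path


def strip_outputs(src, dst, drop_widget_state=True):
    """Write a copy of the notebook at src to dst with all cell outputs removed.

    Hypothetical helper; assumes the nbformat 4 layout (top-level "cells").
    """
    nb = json.loads(Path(src).read_text(encoding="utf-8"))
    for cell in nb.get("cells", []):
        if cell.get("cell_type") == "code":
            cell["outputs"] = []            # drop images, tables, printed text
            cell["execution_count"] = None  # reset the In [n] counter
    if drop_widget_state:
        # Saved widget state lives in notebook-level metadata, not in the cells.
        nb.get("metadata", {}).pop("widgets", None)
    Path(dst).write_text(json.dumps(nb, indent=1) + "\n", encoding="utf-8")
    return nb
```

Writing to a separate `dst` keeps the original file with its outputs untouched, so you can compare sizes before replacing anything.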

Yogesh Awdhut Gadade
  • This reduced mine from 200 MB to a few kB. Thanks! – azizbro Oct 02 '19 at 01:29
  • In addition to this, widgets can easily add several MB of data to a notebook. Widget data can be cleared with the dropdown Widgets > Clear Notebook Widget State – Gman Jul 18 '20 at 21:07
  • Thank you so much @Yogesh, I was starting to hate Jupyter because of that issue. – BND Aug 08 '20 at 06:15
  • Nothing helped me until I used @Gman's method. Images and output clearing didn't make a dent, even after it *looked* like there were no widgets present anymore. Widget clearing changed several notebooks from over 100MB each to 20k each. – thorr18 Apr 29 '21 at 03:56
1

Nowadays you can use jupytext to generate a simple script that is linked to the notebook and that others can rerun.
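To illustrate the idea of a code-only script, here is a dependency-free sketch that pulls the code cells out of a notebook and writes them to a `.py` file, separated by `# %%` markers. This is not jupytext and not how jupytext is implemented; it does no two-way syncing and ignores markdown cells. The name `notebook_to_script` is hypothetical, and the nbformat 4 layout (top-level `cells`) is assumed.

```python
import json
from pathlib import Path


def notebook_to_script(src, dst):
    """Write only the code cells of the notebook at src to the script dst.

    Hypothetical helper; assumes the nbformat 4 layout (top-level "cells").
    """
    nb = json.loads(Path(src).read_text(encoding="utf-8"))
    chunks = []
    for cell in nb.get("cells", []):
        if cell.get("cell_type") != "code":
            continue  # skip markdown and raw cells
        source = cell.get("source", "")
        if isinstance(source, list):
            # nbformat allows the source to be stored as a list of lines
            source = "".join(source)
        chunks.append("# %%\n" + source.rstrip("\n") + "\n")
    script = "\n".join(chunks)
    Path(dst).write_text(script, encoding="utf-8")
    return script
```

Cells that use IPython-only syntax such as `!ls` or `%matplotlib inline` are copied verbatim, so the resulting file is not guaranteed to run under plain Python.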

If you need to keep the images in the notebook (because, for example, you are sharing it with someone who does not want to or cannot rerun it), you might want to try reducing the size of the images.

I found the module ipynbcompress, which seems to do exactly this, but so far I have not been able to install it.

nocibambi
1

I ran into the exact same problem with one of my notebooks, and solved it by displaying df.head(5) instead of the whole df. I did this instead of clearing all outputs because I still wanted to show on GitHub how my code changed the data in the columns of my df.

You can also run !ls -lh in the last cell of your notebook to check the size of the notebook file (as of its last save or autosave). This will give you an idea of whether you need to clear outputs, replace df with df.head(), or remove images in order to reduce the size enough to push the notebook to GitHub.
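If the file turns out to be large, it helps to know which cells are responsible before deciding what to clear. Below is a rough sketch, assuming the nbformat 4 layout (top-level `cells`); the function name `output_sizes` is made up for illustration. It measures each code cell's outputs as serialized JSON, which only approximates the bytes they take up on disk.

```python
import json
from pathlib import Path


def output_sizes(path):
    """Return (cell_index, bytes) pairs for code cells, largest output first.

    Hypothetical helper; assumes the nbformat 4 layout (top-level "cells").
    """
    nb = json.loads(Path(path).read_text(encoding="utf-8"))
    sizes = []
    for index, cell in enumerate(nb.get("cells", [])):
        if cell.get("cell_type") != "code":
            continue
        # Size of the outputs as serialized JSON; approximates bytes on disk.
        n_bytes = len(json.dumps(cell.get("outputs", [])).encode("utf-8"))
        sizes.append((index, n_bytes))
    return sorted(sizes, key=lambda pair: pair[1], reverse=True)
```

For example, `output_sizes("analysis.ipynb")[:5]` (with a placeholder file name) lists the five heaviest cells, which are usually the ones showing images or full DataFrames.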

The smell of roses