18

I have an IPython notebook that runs several steps of a data processing routine and saves intermediate results to files along the way. This way, while developing my code (mostly in a separate .py module), I can skip to and run individual steps. I'd like to set it up so that I can use Cell -> Run All but have it execute only certain steps that are easy to choose. For example, I'd envision defining the steps I want to run in a dict like so:

process = {
    'load files':False,
    'generate interactions list':False,
    'random walk':True,
    'dereference walk':True,
    'reduce walk':True,
    'generate output':True
}

Then the steps would run based on this dict. BTW, each step comprises multiple cells.

I think %macro is not quite what I want, since any time I changed anything or restarted the kernel I'd have to redefine the macro with new cell numbers.

Is there like a %skip or %skipto magic or something along those lines? Or perhaps a clean way to put at the beginning of cells, if process[<current step>]: %dont_run_rest_of_cell?

Nathan Lloyd
    I have the same need - I use notebooks as a template for automatically generated reports. I want to be able to define which sections of my notebook are executed based on some condition, such as whether a certain input file exists (i.e. if this file is provided, run the next 6 cells). The idea reminds me of #define, #ifdef compiler macros from C-family languages. – Gordon Bean Jun 27 '16 at 20:40

7 Answers

29

You can create your own skip magic with the help of a custom kernel extension.

skip_kernel_extension.py

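from IPython import get_ipython

def skip(line, cell=None):
    '''Skips execution of the current line/cell if line evaluates to True.'''
    if eval(line):
        return

    # When used as a line magic (%skip) there is no cell body to run.
    if cell is not None:
        # run_cell handles magics and displays the last expression, unlike ex.
        get_ipython().run_cell(cell)

def load_ipython_extension(shell):
    '''Registers the skip magic when the extension loads.'''
    shell.register_magic_function(skip, 'line_cell')

def unload_ipython_extension(shell):
    '''Unregisters the skip magic when the extension unloads.'''
    # 'line_cell' registers the magic under both kinds, so remove both entries.
    for kind in ('line', 'cell'):
        shell.magics_manager.magics[kind].pop('skip', None)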

Load the extension in your notebook:

%load_ext skip_kernel_extension

Run the skip magic command in the cells you want to skip:

%%skip True  #skips cell
%%skip False #won't skip

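You can use a variable to decide whether a cell should be skipped by prefixing it with $. Define the variable in an earlier cell, because %%skip must be the first line of the cell it controls: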

should_skip = True
%%skip $should_skip
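
To tie this back to the process dict from the question, one possible variant is a cell magic that takes a step name instead of an expression. This is only a sketch: the magic name run_if and the helpers make_run_if and register_run_if are hypothetical, and shell is assumed to be the object returned by get_ipython().

```python
def make_run_if(shell, flags):
    """Build a cell magic that runs its cell only when flags[step] is truthy.

    `shell` is assumed to be the IPython shell (only `run_cell` is used here);
    `flags` is a dict such as the `process` dict from the question.
    """
    def run_if(line, cell=None):
        step = line.strip()
        # Unknown step names are treated as "do not run".
        if cell is not None and flags.get(step, False):
            shell.run_cell(cell)
    return run_if


def register_run_if(shell, flags):
    """Register the magic under the hypothetical name `run_if`."""
    shell.register_magic_function(make_run_if(shell, flags), 'cell', 'run_if')
```

After calling register_run_if(get_ipython(), process) in an early cell, a cell that begins with %%run_if random walk would run only when process['random walk'] is true. The dict is captured by reference, so editing its entries takes effect without re-registering; rebinding the name process to a new dict would not.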
Robbe
  • Cool, but where does one put a "custom kernel extension"? – Arthur Sep 13 '18 at 02:57
  • @Arthur: You can put your extension modules anywhere you want, as long as they can be imported by Python’s standard import mechanism. However, to make it easy to write extensions, you can also put your extensions in extensions/ within the IPython directory. This directory is added to sys.path automatically. – Robbe Sep 13 '18 at 13:03
  • Instead of get_ipython().ex(cell), it is better to use get_ipython().run_cell(cell). – H.C.Chen Dec 03 '21 at 03:59
8

If you are using nbconvert to execute your notebook, you can write a custom preprocessor that looks at cell metadata to know which cells to execute.

import nbconvert.preprocessors

class MyExecutePreprocessor(nbconvert.preprocessors.ExecutePreprocessor):

    def preprocess_cell(self, cell, resources, cell_index):
        """
        Executes a single code cell. See base.py for details.
        To execute all cells see :meth:`preprocess`.

        Checks cell.metadata for 'execute' key. If set, and maps to False, 
          the cell is not executed.
        """

        if not cell.metadata.get('execute', True):
            # Don't execute this cell in output
            return cell, resources

        return super().preprocess_cell(cell, resources, cell_index)

By editing cell metadata, you can specify whether that cell should be executed.

You can get fancier by adding a master dictionary to your notebook metadata. This would look like the dictionary in your example, mapping sections to a boolean specifying whether that section would be called.

Then, in your cell metadata, you can use a "section" keyword mapping to the section ID in your notebook metadata.
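
A minimal sketch of that lookup, using plain dicts in place of the notebook and cell metadata. The metadata keys "process" and "section" and the function name should_execute are assumptions chosen for illustration; the answer does not fix them.

```python
def should_execute(notebook_metadata, cell_metadata):
    """Decide whether a cell should run, given notebook- and cell-level metadata."""
    # An explicit per-cell switch wins: "execute": false always skips the cell.
    if not cell_metadata.get('execute', True):
        return False
    section = cell_metadata.get('section')
    if section is None:
        return True  # cells without a section are always executed
    # Sections missing from the master dictionary default to being executed.
    return bool(notebook_metadata.get('process', {}).get(section, True))
```

To use something like this from preprocess_cell, the notebook metadata would first have to be made available to the preprocessor, for example by storing it in an overridden preprocess method.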

When executing nbconvert, you can tell it to use your preprocessor.

See the docs on Notebook preprocessors for more information.

Gordon Bean
  • I added a few more details when responding to this related question: http://stackoverflow.com/questions/33517900/export-individual-cell-in-ipython-jupyter-notebook. – Gordon Bean Mar 29 '17 at 17:46
  • Thanks @Gordon Bean. What are the most convenient ways to edit cell metadata? Is there some way to do it with a command in a Jupyter notebook? – colorlessgreenidea May 21 '19 at 20:59
  • @colorlessgreenidea - you can edit the metadata via the notebook editor. I'm not sure if there is a way to edit the metadata using a command (though it sounds like a great question for SO ;). – Gordon Bean May 21 '19 at 22:40
5

I am new to Jupyter Notebook and am loving it. I had heard of IPython before but didn't look into it seriously until a recent consulting job.

One trick my associate showed me to disable blocks from execution is to change them from "Code" type to "Raw NBConvert" type. This way I sprinkle diagnostic blocks through my notebook, but only turn them on (make them "Code") if I want them to run.

This method isn't exactly dynamically selectable in a script, but may suit some needs.
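
If the toggling does need to happen from a script, one option is to rewrite the notebook file itself. The sketch below works on the notebook's JSON as plain dicts and assumes the nbformat 4 layout (a top-level "cells" list whose entries carry "cell_type" and "metadata"). The tag name and the functions set_cells_enabled and toggle_file are hypothetical.

```python
import json


def set_cells_enabled(notebook, tag, enabled):
    """Switch cells carrying `tag` between "code" and "raw"; return how many changed."""
    changed = 0
    for cell in notebook.get('cells', []):
        if tag not in cell.get('metadata', {}).get('tags', []):
            continue
        if enabled and cell.get('cell_type') == 'raw':
            cell['cell_type'] = 'code'
            # Code cells carry these two fields in nbformat 4.
            cell.setdefault('outputs', [])
            cell.setdefault('execution_count', None)
            changed += 1
        elif not enabled and cell.get('cell_type') == 'code':
            cell['cell_type'] = 'raw'
            # Raw cells have neither outputs nor an execution count.
            cell.pop('outputs', None)
            cell.pop('execution_count', None)
            changed += 1
    return changed


def toggle_file(path, tag, enabled):
    """Load a notebook file, toggle the tagged cells, and write it back in place."""
    with open(path, encoding='utf-8') as f:
        notebook = json.load(f)
    changed = set_cells_enabled(notebook, tag, enabled)
    with open(path, 'w', encoding='utf-8') as f:
        json.dump(notebook, f, indent=1)
    return changed
```

Something like toggle_file('report.ipynb', 'diagnostic', False) would turn every code cell tagged "diagnostic" into a raw cell. The notebook should be closed in the editor first, otherwise the editor may overwrite the change on its next save.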

RufusVS
4

Adding to what Robbe said above (I can't comment because I'm new), you could just do the following in your first cell if you don't want to create a custom extension that you might just forget about:

def skip(line, cell=None):
    '''Skips execution of the current line/cell if line evaluates to True.'''
    if eval(line):
        return

    # When used as a line magic (%skip) there is no cell body to run.
    if cell is not None:
        get_ipython().run_cell(cell)

# Register the magic directly; no extension file or %load_ext is needed.
get_ipython().register_magic_function(skip, 'line_cell')
1

You can use nbconvert together with the tags field in cell metadata. In my case, I edited the cell metadata to add a "skip" tag:

{
    "deletable": true,
    "colab_type": "code",
    "id": "W9i6oektpgld",
    "tags": [
        "skip"
    ],
    "colab": {},
    "editable": true
}

Create a preprocess.py file.

from nbconvert.preprocessors import Preprocessor

class RemoveCellsWithNoTags(Preprocessor):
    def preprocess(self, notebook, resources):
        executable_cells = []
        for cell in notebook.cells:
            # Drop any cell tagged "skip"; untagged cells are kept.
            if "skip" in cell.metadata.get('tags', []):
                continue
            executable_cells.append(cell)
        notebook.cells = executable_cells
        return notebook, resources

Then export notebook:

jupyter nbconvert --Exporter.preprocessors=[\"preprocess.RemoveCellsWithNoTags\"] --ClearOutputPreprocessor.enabled=True --to notebook --output=getting-started-keras-test getting-started-keras.ipynb
gogasca
0

Explicit is always better than implicit. Simple is better than complicated. So why not use plain Python?

With one cell per step you can do:

if process['load files']:
    load_files()
    do_something()

and

if process['generate interactions list']:
    do_something_else()

If you want to stop the execution when a particular step is skipped you could use:

if not process['reduce walk']:
    stop
else:
    reduce_walk()
    ...

stop is not a defined name, so it raises a NameError, which halts execution when using Cell -> Run All.

You can also make conditional steps like:

if process['reduce walk'] and process['generate output']:
    save_results()
    ...

But, as a rule of thumb, I wouldn't make conditions that are much more complex than that.
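
As a sketch of how this plain-Python approach could be kept tidy when there are many steps, the flags can drive a small helper. The helper name run_steps is hypothetical, and it assumes each step has been wrapped in its own function; only the process dict comes from the question.

```python
def run_steps(process, steps):
    """Call each step function whose flag in `process` is True, in order.

    `steps` is a list of (name, function) pairs; returns the names that ran.
    """
    executed = []
    for name, func in steps:
        if process.get(name, False):  # names missing from the dict count as disabled
            func()
            executed.append(name)
    return executed
```

A call such as run_steps(process, [('load files', load_files), ('random walk', random_walk)]) would then replace the individual if blocks, where load_files and random_walk stand in for your own functions. This trades the multi-cell layout for one function per step, which may or may not suit how you develop.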

user2304916
0

Alternatively, you can skip the cells that you do not want to run by adding the following as the first line of each cell to be skipped:

%%script echo skipping
Roxy