
When saving my Jupyter notebooks with Git, I'd like to remove:

  • Outputs (answered in https://stackoverflow.com/a/58004619/3733974)

  • The ExecuteTime entry in cell metadata (excerpt from the .ipynb file below)

     "metadata": {
      "ExecuteTime": {
       "end_time": "2020-07-09T11:09:35.842718Z",
       "start_time": "2020-07-09T11:09:35.837714Z"
      },
      "tags": [
       "parameters"
      ]
     }
    

I can remove all the metadata with --ClearMetadataPreprocessor.enabled=True, but I want to remove only the ExecuteTime metadata.

How should I update my current command in .git/config?

[filter "strip-notebook-output"]
    clean = "jupyter nbconvert --ClearOutputPreprocessor.enabled=True --ClearMetadataPreprocessor.enabled=True --to=notebook --stdin --stdout --log-level=ERROR"

How can I pass arguments to ClearMetadataPreprocessor?

mbh86
  • Unfortunately, the ClearMetadataPreprocessor has no option other than removing all the metadata. See also the pull request for this feature: https://github.com/jupyter/nbconvert/pull/805 – onno Oct 23 '20 at 08:38
  • You can also use `jq` for this, as described here https://stackoverflow.com/a/74104683/2166823 – joelostblom Oct 18 '22 at 19:07

1 Answer


This can be done using the preserve_cell_metadata_mask option of the ClearMetadataPreprocessor. As far as I can tell, this feature was added in nbconvert v6.0.

Quote from jupyter nbconvert --help-all:

--ClearMetadataPreprocessor.preserve_cell_metadata_mask=<set-item-1>...
Indicates the key paths to preserve when deleting metadata across both cells and notebook metadata fields. Tuples of keys can be passed to preserved specific nested values
Default: set()

For the OP's example, the command below keeps only the tags entry in each cell's metadata and removes everything else, including ExecuteTime:

jupyter nbconvert --ClearOutputPreprocessor.enabled=True \
--ClearMetadataPreprocessor.enabled=True \
--ClearMetadataPreprocessor.preserve_cell_metadata_mask='[("tags")]' \
--to=notebook --stdin --stdout --log-level=ERROR
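
Note that the mask works as an allow-list: it keeps tags and drops every other cell metadata key, not just ExecuteTime. If the goal is literally to remove only ExecuteTime and leave all other keys alone, a minimal sketch using only Python's standard library might look like the following. This is not part of nbconvert; the function names strip_execute_time and filter_stream are made up for illustration, and the code assumes the notebook is nbformat 4 JSON with a top-level "cells" list.

```python
import json


def strip_execute_time(notebook: dict) -> dict:
    """Remove the ExecuteTime entry from every cell's metadata (in place).

    Hypothetical helper, not an nbconvert API. All other metadata keys,
    such as "tags", are left untouched.
    """
    for cell in notebook.get("cells", []):
        # pop() with a default does nothing when the key is absent
        cell.get("metadata", {}).pop("ExecuteTime", None)
    return notebook


def filter_stream(src, dst) -> None:
    """Read notebook JSON from src and write the cleaned JSON to dst.

    Assumption: src and dst are text streams, e.g. sys.stdin and
    sys.stdout when the code is run as a Git clean filter.
    """
    notebook = json.load(src)
    json.dump(strip_execute_time(notebook), dst, indent=1, ensure_ascii=False)
    dst.write("\n")
```

Saved in a script that calls filter_stream(sys.stdin, sys.stdout), this could be piped after an nbconvert command that only clears outputs, so the clean filter strips outputs and ExecuteTime while keeping the remaining metadata.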
Anton Babkin