As mentioned in one of the previous answers, you can use the command-line JSON processor jq to perform this task notably quicker than with nbconvert. A complete command for getting rid of metadata, outputs and execution counts can be found in this blog post:
jq --indent 1 \
'
(.cells[] | select(has("outputs")) | .outputs) = []
| (.cells[] | select(has("execution_count")) | .execution_count) = null
| .metadata = {"language_info": {"name":"python", "pygments_lexer": "ipython3"}}
| .cells[].metadata = {}
' 01-parsing.ipynb
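To make it clearer what each line of that jq program does, here is a rough Python equivalent using only the standard library. It is a sketch for illustration, not a replacement for the jq command: the function name `strip_notebook` and the sample notebook are made up for this example, and the serialized output is not guaranteed to match jq's byte for byte.

```python
import json


def strip_notebook(nb):
    """Clear outputs, execution counts and metadata in place (hypothetical helper)."""
    for cell in nb.get("cells", []):
        # (.cells[] | select(has("outputs")) | .outputs) = []
        if "outputs" in cell:
            cell["outputs"] = []
        # (.cells[] | select(has("execution_count")) | .execution_count) = null
        if "execution_count" in cell:
            cell["execution_count"] = None
        # .cells[].metadata = {}
        cell["metadata"] = {}
    # .metadata = {"language_info": {...}}
    nb["metadata"] = {
        "language_info": {"name": "python", "pygments_lexer": "ipython3"}
    }
    return nb


# Made-up minimal notebook, only for demonstration.
sample = {
    "cells": [
        {
            "cell_type": "code",
            "execution_count": 3,
            "metadata": {"collapsed": False},
            "outputs": [{"output_type": "stream", "text": ["hi\n"]}],
            "source": ["print('hi')"],
        },
        {"cell_type": "markdown", "metadata": {"tags": ["x"]}, "source": ["# Title"]},
    ],
    "metadata": {"kernelspec": {"name": "python3"}},
    "nbformat": 4,
    "nbformat_minor": 5,
}

# indent=1 plays the role of jq's --indent 1
print(json.dumps(strip_notebook(sample), indent=1))
```

Note that markdown cells have no `outputs` or `execution_count` keys, which is why both the jq program and this sketch check for the key before overwriting it.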
If desired, you could modify this to clean only a specific part of the notebook, such as execution counts (recursively, wherever they occur in the JSON), and then add it as a git filter in your git config (for example ~/.gitconfig):
[filter "nbstrip"]
    clean = jq --indent 1 '(.. | ."execution_count"? | select(. != null)) = null'
    smudge = cat
Then add the following to ~/.config/git/attributes to have the filter applied globally to all your local repos (the filter name must match the one defined in the [filter "nbstrip"] section):
*.ipynb filter=nbstrip
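If the recursive jq expression in the clean filter looks cryptic, the following Python sketch shows roughly the same traversal: walk the whole JSON tree and set every non-null `execution_count` value to null, without adding the key where it is absent. The function name `null_execution_counts` and the sample data are invented for this illustration.

```python
import json


def null_execution_counts(node):
    """Recursively set every non-null "execution_count" value to None, in place."""
    if isinstance(node, dict):
        for key, value in node.items():
            if key == "execution_count" and value is not None:
                # Overwriting an existing key while iterating is safe,
                # because the dict does not change size.
                node[key] = None
            else:
                null_execution_counts(value)
    elif isinstance(node, list):
        for item in node:
            null_execution_counts(item)
    return node


# Made-up notebook fragment: the count appears on the cell and inside an output.
sample = {
    "cells": [
        {
            "cell_type": "code",
            "execution_count": 12,
            "outputs": [
                {"output_type": "execute_result", "execution_count": 12, "data": {}}
            ],
        },
        {"cell_type": "markdown", "source": ["text"]},
    ]
}

print(json.dumps(null_execution_counts(sample), indent=1))
```

Unlike the full cleanup, this leaves outputs and metadata untouched, so only the counters that change on every re-run are removed from the diff.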
There is also nbstripout, which is made for this purpose, but it is a bit slower.