19

I'm currently using a Jupyter (IPython) notebook, and the file I'm working with contains a lot of code. I'd like to know exactly how many lines of code it contains. Counting by hand is hard because the code is split across many separate cells.

For anyone experienced with Jupyter Notebook: how do you count the total number of lines of code in a notebook file?

Thanks!

Edit: I've figured out a way to do this, although it is fairly roundabout: download the notebook as a .py file, open that file in Xcode (or whatever IDE you use), and read the line count there.
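A minimal sketch of the same idea without opening an IDE, assuming the notebook has already been exported to a .py file. The function name `count_exported_lines` is hypothetical, and the `# In[...]:` cell-marker pattern is an assumption about how the export labels cells; adjust it if your exported file looks different.

```python
import re
from pathlib import Path

# Assumed format of the per-cell markers in an exported script,
# e.g. "# In[3]:" for an executed cell or "# In[ ]:" for an unexecuted one.
CELL_MARKER = re.compile(r"^# In\[[\d ]*\]:\s*$")


def count_exported_lines(py_path):
    """Return (total_lines, lines_without_cell_markers) for an exported script."""
    lines = Path(py_path).read_text(encoding="utf-8").splitlines()
    # Drop only the marker lines; blank lines and ordinary comments are kept.
    kept = [line for line in lines if not CELL_MARKER.match(line)]
    return len(lines), len(kept)
```

Calling `count_exported_lines("my_notebook.py")` (a placeholder file name) returns both counts. The second number is likely closer to what the notebook itself contains, although the export may also add blank lines between cells, so treat it as an approximation.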

Cynthia
  • You should specify whether blank lines and comments count toward the total, and also what you've tried – PRMoureu Jul 28 '17 at 20:49
  • You can show line numbers in the Jupyter Notebook with Ctrl+M, then L: https://stackoverflow.com/questions/10979667/showing-line-numbers-in-ipython-jupyter-notebooks – Carlo Mazzaferro Jul 28 '17 at 21:42
  • Yes, I have the line numbers shown but I would like to count the total number of lines in my file (including comments and blank lines). – Cynthia Jul 28 '17 at 22:38

3 Answers

29

This script prints the total number of lines of code (LOC) in the code cells of one or more notebooks that you pass to it on the command line:

#!/usr/bin/env python

from json import load
from sys import argv

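def loc(nb):
    # A notebook is a JSON file; each cell stores its text as a list of lines.
    with open(nb) as f:
        cells = load(f)['cells']
    # Count the lines of code cells only (markdown and raw cells are skipped).
    return sum(len(c['source']) for c in cells if c['cell_type'] == 'code')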

def run(ipynb_files):
    return sum(loc(nb) for nb in ipynb_files)

if __name__ == '__main__':
    print(run(argv[1:]))

So you could do something like $ ./loc.py nb1.ipynb nb2.ipynb to get results.

Jessime Kirk
5

The same can be done from the shell if you have the jq utility installed:

jq '.cells[] | select(.cell_type == "code") .source[]' nb1.ipynb nb2.ipynb | wc -l

You can also use grep to filter the lines further before counting. For example, to exclude blank lines (which jq prints as "\n"), invert the match with -v: | grep -v -e ^\"\\\\n\"$ | wc -l
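For readers who prefer to stay in Python, here is a rough sketch of the same filtering idea: it counts total, blank, and comment-only lines in the code cells so that you can subtract whichever you do not want. The names `classify_lines` and `classify_file` are hypothetical, and treating every line that starts with `#` as a comment is a simplifying assumption.

```python
import json


def classify_lines(notebook):
    """Count total, blank, and comment-only lines in the code cells of a parsed notebook."""
    counts = {"total": 0, "blank": 0, "comment": 0}
    for cell in notebook.get("cells", []):
        if cell.get("cell_type") != "code":
            continue
        source = cell.get("source", [])
        # "source" is usually a list of lines, but handle a single string as well.
        if isinstance(source, str):
            source = source.splitlines()
        for line in source:
            stripped = line.strip()
            counts["total"] += 1
            if not stripped:
                counts["blank"] += 1
            elif stripped.startswith("#"):
                counts["comment"] += 1
    return counts


def classify_file(path):
    """Load an .ipynb file as UTF-8 JSON and classify its code-cell lines."""
    with open(path, encoding="utf-8") as f:
        return classify_lines(json.load(f))
```

With the result in `c`, the number of non-blank lines is `c["total"] - c["blank"]`, and subtracting `c["comment"]` as well leaves only lines that contain code.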

Kirill Voronin
3

The answer from @Jessime Kirk is really good, but it seems to fail when the .ipynb file contains Chinese characters. So I modified the code to open the file explicitly as UTF-8, as shown below.

#!/usr/bin/env python

from json import load
from sys import argv

def loc(nb):
    with open(nb, encoding='utf-8') as data_file:
        cells = load(data_file)['cells']
        return sum(len(c['source']) for c in cells if c['cell_type'] == 'code')

def run(ipynb_files):
    return sum(loc(nb) for nb in ipynb_files)

if __name__ == '__main__':
    print(r"This script counts the lines of code in .ipynb files.")
    print(r"usage: python countIpynbLine.py xxx.ipynb")
    print(r"example: python countIpynbLine.py .\test_folder\test.ipynb")
    print(r"It can also count the lines in several notebooks at once.")
    print(r"usage: python countIpynbLine.py code_1.ipynb code_2.ipynb")
    print(r"Counting lines...")
    print(run(argv[1:]))
常耀耀