I have a Jupyter Notebook running. I want to be able to access the source of the current Jupyter Notebook from within Python. My end goal is to pass it into ast.parse so I can do some analysis on the user's code. Ideally, I'd be able to do something like this:

import ast
ast.parse(get_notebooks_code())

Obviously, if the source code were an IPYNB file, there'd be an intermediary step of extracting the code from the code cells, but that's a relatively easy problem to solve.
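For reference, that extraction step might look roughly like the sketch below. It assumes the nbformat 4 layout (a top-level "cells" list whose entries carry "cell_type" and "source"), and the helper name code_from_notebook and the sample notebook are made up for illustration. Magics are filtered out in a deliberately crude way because ast.parse would reject them.

```python
import ast
import json


def code_from_notebook(nb_json: str) -> str:
    """Join the source of every code cell in an nbformat-4 notebook."""
    notebook = json.loads(nb_json)
    chunks = []
    for cell in notebook.get("cells", []):
        if cell.get("cell_type") != "code":
            continue
        source = cell.get("source", "")
        # On disk, "source" may be a list of lines or a single string.
        if isinstance(source, list):
            source = "".join(source)
        # Skip cell magics (e.g. %%javascript): the body is not Python.
        if source.lstrip().startswith("%%"):
            continue
        # Drop line magics and shell escapes, which ast.parse would reject.
        # Simplification: a continuation line that begins with "%" is lost too.
        lines = [
            line for line in source.splitlines()
            if not line.lstrip().startswith(("%", "!"))
        ]
        chunks.append("\n".join(lines))
    return "\n".join(chunks)


# Hypothetical two-cell notebook used only for illustration.
example = json.dumps({
    "nbformat": 4,
    "nbformat_minor": 5,
    "metadata": {},
    "cells": [
        {"cell_type": "markdown", "metadata": {}, "source": ["# Title"]},
        {"cell_type": "code", "metadata": {}, "outputs": [],
         "execution_count": None,
         "source": ["%matplotlib inline\n", "x = 1\n", "y = x + 1"]},
    ],
})

tree = ast.parse(code_from_notebook(example))
print(ast.dump(tree))
```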

So far, I've found code that uses the list_running_servers function to query each running server and match up kernel IDs, which gives me the filename of the currently running notebook. That would work, except that the source code on disk may not match what the user has in the browser (until you save a new checkpoint).
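The matching step in that approach can be sketched as below. The code that actually talks to the server is left out; the assumptions here are that the kernel's connection file is named kernel-<id>.json and that each server's sessions listing is a list of dicts with a "kernel" entry holding an "id" plus a "path" (or an older nested "notebook" entry). The function names and sample data are hypothetical.

```python
import re
from typing import Optional


def kernel_id_from_connection_file(path: str) -> Optional[str]:
    """Pull the kernel ID out of a path such as .../kernel-<id>.json."""
    match = re.search(r"kernel-(.+)\.json$", path)
    return match.group(1) if match else None


def notebook_path_for_kernel(sessions: list, kernel_id: str) -> Optional[str]:
    """Return the path of the session whose kernel has the given ID."""
    for session in sessions:
        if session.get("kernel", {}).get("id") == kernel_id:
            # Assumption: newer servers expose "path" directly, while older
            # ones nest it under "notebook".
            return session.get("path") or session.get("notebook", {}).get("path")
    return None


# Hypothetical data standing in for a real connection file and API response.
connection_file = "/run/user/1000/jupyter/kernel-1234abcd.json"
sessions = [
    {"kernel": {"id": "ffff0000"}, "path": "other.ipynb"},
    {"kernel": {"id": "1234abcd"}, "notebook": {"path": "work/analysis.ipynb"}},
]

kernel_id = kernel_id_from_connection_file(connection_file)
print(notebook_path_for_kernel(sessions, kernel_id))
```

Even when the match succeeds, this only yields a file path, so the stale-on-disk problem still applies.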

I've seen some ideas involving extracting out data using JavaScript, but that requires either a separate cell with magic or calling the display.Javascript function - which fires asynchronously, and therefore doesn't allow me to pass the result to ast.parse.

Does anyone have any clever ideas for how to dynamically get the current notebook's source code as a string in Python for immediate processing? I'm perfectly fine if this needs to be an extension or even a kernel wrapper; I just need to get the source code somehow.

Austin Cory Bart
  • I agree, it would be nice if there were something that ties into Jupyter/IPython natively in Python. Something with an interface like `jupyter.notebook.dumps()`. I haven't found it yet either. – Scott H Oct 22 '19 at 14:30

2 Answers

Well, this isn't exactly what I wanted, but here's my current strategy. I need to run some Python code based on the user's code, but it doesn't actually have to be connected to the user's code directly. So I'm just going to run the following magic afterwards:

%%javascript
// Get source code from cells
var source_code = Jupyter.notebook.get_cells().map(function(cell) {
    if (cell.cell_type == "code") {
        var source = cell.code_mirror.getValue();
        if (!source.startsWith("%%javascript")) {
            return source;
        }
    }
}).join("\n");
// Embed the code as a Python string literal.
source_code = JSON.stringify(source_code);
var instructor_code = "student_code="+source_code;
instructor_code += "\nimport ast\nprint(ast.dump(ast.parse(student_code)))\nprint('Great')";
// Run the Python code along with additional code I wanted.
var kernel = Jupyter.notebook.kernel;
var t = kernel.execute(instructor_code, { 'iopub' : {'output' : function(x) {
    if (x.msg_type == "error") {
        console.error(x.content);
        element.text(x.content.ename+": "+x.content.evalue+"\n"+x.content.traceback.join("\n"))
    } else {
        element.html(x.content.text.replace(/\n/g, "<br>"));
        console.log(x);
    }
}}});
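The trick that makes this work is that a JSON string literal is, for ordinary source text, also a valid Python string literal. Here is a small Python-only sketch of that round trip, with json.dumps standing in for JSON.stringify; the sample student_code is made up, and unusual input such as lone surrogates is not covered.

```python
import ast
import json

# Stand-in for the text collected from the notebook cells.
student_code = 'name = "Ada"\nprint(\'hi\\n\', name)\n'

# json.dumps plays the role of JSON.stringify here; ensure_ascii=False keeps
# non-ASCII characters unescaped, which is how JSON.stringify treats them.
literal = json.dumps(student_code, ensure_ascii=False)
generated = "student_code=" + literal + "\nimport ast\ntree = ast.parse(student_code)"

# Execute the generated program the way the kernel would.
namespace = {}
exec(generated, namespace)

# The string survives the round trip unchanged and parses as two statements.
print(namespace["student_code"] == student_code)
print(len(namespace["tree"].body))
```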
Austin Cory Bart

What about combining https://stackoverflow.com/a/44589075/1825043 and https://stackoverflow.com/a/54350786/1825043? That gives something like:

%%javascript
IPython.notebook.kernel.execute('nb_name = "' + IPython.notebook.notebook_name + '"')

and

import os
from nbformat import read, NO_CONVERT

nb_full_path = os.path.join(os.getcwd(), nb_name)
with open(nb_full_path) as fp:
    notebook = read(fp, NO_CONVERT)
cells = notebook['cells']
code_cells = [c for c in cells if c['cell_type'] == 'code']
for no_cell, cell in enumerate(code_cells):
    print(f"####### Cell {no_cell} #########")
    print(cell['source'])
    print("")

I get

####### Cell 0 #########
%%javascript
IPython.notebook.kernel.execute('nb_name = "' + IPython.notebook.notebook_name + '"')

####### Cell 1 #########
import os
from nbformat import read, NO_CONVERT

nb_full_path = os.path.join(os.getcwd(), nb_name)
with open(nb_full_path) as fp:
    notebook = read(fp, NO_CONVERT)
cells = notebook['cells']
code_cells = [c for c in cells if c['cell_type'] == 'code']
for no_cell, cell in enumerate(code_cells):
    print(f"####### Cell {no_cell} #########")
    print(cell['source'])
    print("")
Christian O'Reilly
  • Doesn't this rely on the latest version of the code being saved before being executed? – Austin Cory Bart Feb 01 '22 at 09:10
  • Yes, indeed, @AustinCoryBart. You can save the notebook from within itself, though, if you want. Something like `from IPython.display import display, Javascript; display(Javascript('IPython.notebook.save_checkpoint();'))` (taken from https://stackoverflow.com/a/57814673/1825043). – Christian O'Reilly Feb 01 '22 at 19:02
  • Is there a way to guarantee that the save finishes before the execution loads the file? I seem to recall going down this path and that being a major issue. – Austin Cory Bart Feb 02 '22 at 20:06
  • I am not sure if any of these functions run in a separate thread. As long as they don't, you should be fine... but I can't say for sure without more investigation. – Christian O'Reilly Feb 03 '22 at 00:58
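One way to narrow the race raised in these comments is to record the file's modification time before triggering the save and then poll until it changes. This is only a sketch: wait_for_modification is a hypothetical helper, and it assumes the front end actually performs the save while the kernel is still waiting, which nothing in this thread confirms. Filesystems with coarse timestamp resolution could also hide a quick save.

```python
import os
import tempfile
import time


def wait_for_modification(path: str, previous_mtime: float,
                          timeout: float = 5.0, interval: float = 0.1) -> bool:
    """Poll until the file's mtime moves past previous_mtime or time runs out."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        if os.path.getmtime(path) > previous_mtime:
            return True
        time.sleep(interval)
    return False


# Demonstration on a throwaway file; os.utime simulates the save.
with tempfile.NamedTemporaryFile(suffix=".ipynb", delete=False) as tmp:
    demo_path = tmp.name
os.utime(demo_path, (1_000, 1_000))
before = os.path.getmtime(demo_path)
os.utime(demo_path, (2_000, 2_000))
print(wait_for_modification(demo_path, before, timeout=1.0))
os.remove(demo_path)
```

In a notebook, the idea would be to read the mtime, issue the save, and only open the file once the helper returns True; a False result means the on-disk copy may still be stale.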