The problem: I want to interact with Jupyter from another application via the Jupyter API. At a minimum, I want to run my notebooks from the app (ideally, I would also edit some paragraphs before running them). I've read the API documentation but haven't found what I need.

I've used Apache Zeppelin for this purpose; it has the same structure (notebooks and paragraphs).

Has anybody used Jupyter for the purpose I've just described?

Alexander Yakovlev
  • You might want to have a look at one of these projects, because those libs/addons do exactly that: https://atom.io/packages/hydrogen or https://atom.io/packages/jupyter-notebook . I hope that helps. – Elmar Macek Feb 06 '19 at 10:41

3 Answers

Leaving aside whether the Jupyter API is the best solution for the problem (which is not clearly described in the question), the code below does what you asked for: it remotely executes a Jupyter notebook over HTTP and gets some results. It is not production-ready; it is more an example of how it can be done. I did not test it with cells that generate lots of output, and I think it would need adjustments for that.

You can also change/edit the code programmatically by altering the `code` array.

You will need to change `notebook_path`, `base` and `headers` according to your configuration; see the code for details.

import json
import requests
import datetime
import uuid
from pprint import pprint
from websocket import create_connection

# The token is written on stdout when you start the notebook
notebook_path = '/Untitled.ipynb'
base = 'http://localhost:9999'
headers = {'Authorization': 'Token 4a72cb6f71e0f05a6aa931a5e0ec70109099ed0c35f1d840'}

# Start a new kernel; the response contains its id
url = base + '/api/kernels'
response = requests.post(url, headers=headers)
kernel = json.loads(response.text)

# Load the notebook and get the source of each non-empty code cell
# (markdown and raw cells must not be sent to the kernel)
url = base + '/api/contents' + notebook_path
response = requests.get(url, headers=headers)
file = json.loads(response.text)
code = [c['source'] for c in file['content']['cells']
        if c['cell_type'] == 'code' and len(c['source']) > 0]

# Execution request/reply is done on websockets channels
ws_base = base.replace('http', 'ws', 1)  # http -> ws, https -> wss
ws = create_connection(ws_base + '/api/kernels/' + kernel['id'] + '/channels',
     header=headers)

def send_execute_request(code):
    # Build (not send) an execute_request message, Jupyter messaging protocol 5.0
    msg_type = 'execute_request'
    content = { 'code' : code, 'silent':False }
    hdr = { 'msg_id' : uuid.uuid1().hex,
        'username': 'test',
        'session': uuid.uuid1().hex,
        'date': datetime.datetime.now().isoformat(),
        'msg_type': msg_type,
        'version' : '5.0' }
    # A request that starts a new exchange has no parent, so parent_header is empty
    msg = { 'header': hdr, 'parent_header': {},
        'metadata': {},
        'content': content }
    return msg

for c in code:
    ws.send(json.dumps(send_execute_request(c)))

# We ignore all the other messages, we just get the code execution output
# (this needs to be improved for production to take into account errors, large cell output, images, etc.)
# Note: this waits forever if a cell produces no stream (stdout/stderr) output.
for i in range(0, len(code)):
    msg_type = ''
    while msg_type != "stream":
        rsp = json.loads(ws.recv())
        msg_type = rsp["msg_type"]
    print(rsp["content"]["text"])

ws.close()
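
One possible way to avoid waiting forever on a cell that prints nothing is to read until the kernel reports that it is idle again for the request, instead of stopping at the first `stream` message. This is only a minimal sketch, assuming the message layout of the Jupyter messaging protocol 5.x as delivered over the notebook websocket; `collect_outputs` is a hypothetical helper name, and `ws` can be any object with a `recv()` method, such as the connection created with `create_connection`.

```python
import json


def collect_outputs(ws, msg_id):
    """Read kernel messages until the request `msg_id` has finished.

    Returns a list of (kind, text) tuples for stream output and errors.
    Other message types (execute_result, display_data, ...) are ignored.
    """
    outputs = []
    while True:
        rsp = json.loads(ws.recv())
        # Skip messages that belong to another request (or to no request).
        parent = rsp.get("parent_header") or {}
        if parent.get("msg_id") != msg_id:
            continue
        msg_type = rsp["msg_type"]
        content = rsp["content"]
        if msg_type == "stream":
            # content["name"] is "stdout" or "stderr"
            outputs.append((content["name"], content["text"]))
        elif msg_type == "error":
            outputs.append(("error", "\n".join(content["traceback"])))
        elif msg_type == "status" and content["execution_state"] == "idle":
            # The kernel is done with this request.
            return outputs
```

To use it, send one cell at a time and pass the `msg_id` from the header of the message you sent, for example `collect_outputs(ws, msg['header']['msg_id'])` right after `ws.send(json.dumps(msg))`.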

Useful links on which this code is based (I recommend reading them if you want more info):

Note that there is also https://jupyter-client.readthedocs.io/en/stable/index.html, but as far as I could tell it does not support HTTP as a transport.

For reference, this works with notebook-5.7.4; I am not sure about other versions.
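
To illustrate the remark about altering the `code` array, here is a small sketch of picking out the code cells and editing one of them before it is sent to the kernel. It assumes the nbformat 4 cell layout (`cell_type` and `source` keys); the sample notebook, the helper names `code_cells` and `cell_source`, and the cell contents are made up for the example.

```python
def code_cells(notebook):
    """Return the code cells (dicts) of a notebook in nbformat 4 layout."""
    return [c for c in notebook["cells"] if c["cell_type"] == "code"]


def cell_source(cell):
    # 'source' may be a single string or a list of lines,
    # depending on how the notebook was loaded.
    src = cell["source"]
    return src if isinstance(src, str) else "".join(src)


# Stand-in for file['content'] as returned by the contents API (values are made up).
notebook = {"cells": [
    {"cell_type": "markdown", "source": "# Title"},
    {"cell_type": "code", "source": "n = 10"},
    {"cell_type": "code", "source": ["total = sum(range(n))\n", "print(total)"]},
]}

code = [cell_source(c) for c in code_cells(notebook)]
code[0] = "n = 100"  # edit the first code cell before sending it to the kernel
```

Only the in-memory list changes here; the notebook file stored on the server is not modified.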

vladmihaisima
  • Great code snippet! As a quick note, in case you don't want to inspect the console output to find what the jupyter token is, you can specify the token manually when launching the notebook using an env var like this: `$ JUPYTER_TOKEN=this-is-my-token jupyter notebook` – Martin Zugnoni Nov 21 '19 at 15:25
  • When I try this, `while msg_type != "stream"` seems to be an infinite loop. I can however see messages of type `execute_reply`, but these don't seem to have the code output in them. Any ideas why this might be the case? – Dan Jan 21 '20 at 19:00
  • Why are you appending /api/kernels? I am able to access my notebook in the browser from the base URL, but if I append the /api/kernels part, no page is found. – Infinite Dec 04 '20 at 10:22
  • @Infinite : maybe it would help to mention what version of notebook you are using. The answer is about 2 years old and between the version mentioned in the answer (5.7.4) and today there were around 20 new versions - things might have changed. – vladmihaisima Dec 09 '20 at 16:18
  • How can you ensure that the variables in 'code' are retained as global variables? I tried using the above code in a Flask app with CodeMirror, but a variable defined in cell one is not retained in cell two. How can we ensure that a variable defined in cell one is also available in cell two? – morelloking Aug 25 '21 at 14:22
  • The variables in the notebook cells are stored in the Jupyter notebook server environment (not related to the above code). I would suggest that you open a new question with your complete Flask app code and other relevant description; without that it will be hard to guess what is happening and what is the best way to achieve it. – vladmihaisima Aug 26 '21 at 07:03
  • You should only execute code cells (and ignore markdown). Thus, the correct line to collect the code cells to be executed is: `code = [ c['source'] for c in file['content']['cells'] if len(c['source'])>0 and c['cell_type']=='code' ]` – Tamas Foldi Jul 04 '23 at 12:52

Extending the code by @vladmihaisima to raise on kernel errors and to stop waiting when the socket times out:

from websocket import create_connection, WebSocketTimeoutException

        while msg_type != "stream":
            try:
                rsp = json.loads(ws.recv())
                print(rsp["msg_type"])
                print(rsp)
                msg_type = rsp["msg_type"]
                if msg_type == "error":
                    raise Exception(rsp['content']['traceback'][0])
            except WebSocketTimeoutException as _e:
                print("No output")
                return
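
For completeness, the fragment above could be wrapped into a function roughly like this. It is a sketch, not the exact code I run: `read_stream_output` is a hypothetical name, and the fallback exception class exists only so the snippet can be read and tried without websocket-client installed. Note that `WebSocketTimeoutException` is only raised if the connection has a timeout, for example `create_connection(url, header=headers, timeout=5)` (the value 5 is an arbitrary choice).

```python
import json

try:
    from websocket import WebSocketTimeoutException
except ImportError:
    # Fallback so the sketch works without websocket-client installed.
    class WebSocketTimeoutException(Exception):
        pass


def read_stream_output(ws):
    """Return the text of the next 'stream' message, or None on timeout.

    Raises RuntimeError if the kernel reports an error first.
    """
    while True:
        try:
            rsp = json.loads(ws.recv())
        except WebSocketTimeoutException:
            return None  # nothing arrived within the socket timeout
        if rsp["msg_type"] == "error":
            raise RuntimeError("\n".join(rsp["content"]["traceback"]))
        if rsp["msg_type"] == "stream":
            return rsp["content"]["text"]
```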
muTheTechie

I believe that using a remote Jupyter Notebook is over-engineering in your case.

A good way, as I see it, is to pass the necessary parameters to a Python program with good logging.
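
As a rough sketch of what I mean, using only the standard library: the option names `--input` and `--threshold`, the logger name and the returned dictionary are placeholders, and the body of `run` is where the code from the notebook cells would go.

```python
import argparse
import logging

log = logging.getLogger("report")


def build_parser():
    parser = argparse.ArgumentParser(description="Run the analysis as a plain script.")
    parser.add_argument("--input", required=True, help="path to the input data (placeholder)")
    parser.add_argument("--threshold", type=float, default=0.5, help="placeholder parameter")
    return parser


def run(args):
    log.info("starting with input=%s threshold=%s", args.input, args.threshold)
    # ... the code that used to live in the notebook cells goes here ...
    result = {"input": args.input, "threshold": args.threshold}
    log.info("finished")
    return result


def main(argv=None):
    # argv=None makes argparse read the real command line.
    logging.basicConfig(level=logging.INFO,
                        format="%(asctime)s %(levelname)s %(message)s")
    return run(build_parser().parse_args(argv))
```

The calling application can then start the script with different arguments (or import it and call `main([...])` directly) and follow the log output instead of talking to a notebook server.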

Max Belousov