18

I have an ipython notebook with an embedded image from my local drive. I was expecting it to be embedded in the JSON along with the output of code cells, but when I distributed the notebook, the image did not appear to users. What is the recommended way (or ways) to embed an image in a Notebook, so that it doesn't disappear if users rerun code cells, clear cell output, etc.?

The notebook system caches images included with ![label](image.png), but they last only until the python "kernel" serving the notebook is restarted. If I rename the image file on disk, I can close and reopen the notebook and it still shows the image; but it disappears when I restart the kernel.

Edit: If I generate an image as code cell output and then export the notebook to html, the image is embedded in the html as encoded data. Surely there must be a way to hook into this functionality and load the output into a markdown (or better yet "raw nbconvert") cell?

from IPython.display import Image 
Image(filename='imagename.png')

will be exported (with ipython nbconvert) to html that contains the following:

<div class="output_png output_subarea output_execute_result">
<img src="...
</div>

However, even when I manually embedded this snippet into a markdown cell, I couldn't get the image to display. What am I doing wrong?

Update (2020)

Apparently, the problem has (finally!) been addressed in newer notebook / Jupyter versions: as of 2018 (thanks for the link, @Wayne), the HTML sanitizer will accept an embedded HTML image, as in <img src="...>. Markdown image syntax also accepts images as embedded data, so there are two ways to do this. Details are in the answers below.

alexis
  • Pity that nobody has answered this question in all this time! Is there perhaps a solution by now? – alexis Apr 22 '15 at 14:10
  • I've run into the exact same problem. Apparently the reason why entering ` – tel Oct 01 '15 at 06:03
  • Yeah, that's reasonable, and I had expected some sort of filter. Actually I'd be surprised if originally there was no filter at all-- it's more likely that was just strengthened in version 2. But the question is still, _is_ there some method that gets past the filters? – alexis Oct 01 '15 at 15:32
  • I bet there is, but I'm pretty sure it would qualify as a 0-day exploit in IPython/[Google Caja](https://github.com/google/caja) (the HTML sanitizer) :) – tel Oct 01 '15 at 18:10

5 Answers

11

Are you happy to use an extra code cell to display the image? If so, use this:

from IPython.display import Image
Image(filename="example.png")

The output cell will have the raw image data embedded in the .ipynb file so you can share it and the image will be retained.
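To check that the image data really ended up inside the .ipynb file, you can inspect the notebook JSON directly. The following is a minimal sketch, assuming the version 4 notebook format (where each output carries a "data" dictionary keyed by MIME type); the function name embedded_images is made up for this illustration.

```python
import base64
import json


def embedded_images(notebook_json):
    """Yield (cell_index, mime_type, size_in_bytes) for PNG/JPEG data in cell outputs.

    Assumes nbformat 4, where binary images are stored as base64 text.
    """
    nb = json.loads(notebook_json)
    for index, cell in enumerate(nb.get("cells", [])):
        # Markdown cells have no "outputs" key, so they are skipped here.
        for output in cell.get("outputs", []):
            for mime, payload in output.get("data", {}).items():
                if mime in ("image/png", "image/jpeg"):
                    # On disk the payload may be a single string or a list of lines.
                    text = "".join(payload) if isinstance(payload, list) else payload
                    yield index, mime, len(base64.b64decode(text))
```

Calling list(embedded_images(open("notebook.ipynb").read())) on a saved notebook lists the images that will travel with the file. An empty result means the outputs were cleared or the image was only linked.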

Note that the Image class also has a url keyword, but this will only link to the image unless you also specify embed=True (see the documentation for details). So it's safer to use the filename keyword unless you are referring to an image on a remote server.

I'm not sure if there is an easy solution if you require the image to be included in a Markdown cell, i.e. without a separate code cell to generate the embedded image data. You may be able to use the python markdown extension, which allows dynamically displaying the contents of Python variables in markdown cells. However, the extension generates the markdown cells dynamically, so in order to retain the output when sharing the notebook you will need to run ipython nbconvert --to notebook original_notebook.ipynb --output preprocessed_notebook using the preprocessor pymdpreprocessor.py, as mentioned in the extension's "Installation" section. The generated notebook then has the data embedded in the markdown cell as an HTML tag of the form <img src="data:image/png;base64,..."> so you can delete the corresponding code cell from preprocessed_notebook.ipynb. Unfortunately, when I tried this the contents of the <img> tag weren't actually displayed in the browser, so I'm not sure this is a viable solution. :-/

A different option would be to use the Image class in a code cell to generate the image as above, and then use nbconvert with a custom template to remove code input cells from the notebook. See this thread for details. However, this will strip all code cells from the converted notebook, so it may not be what you want.

cilix
  • Thanks! I don't _require_ the image to be in a markdown cell, but using code is (a) distracting, since the code cell cannot be hidden; and (b) more crucially, it's unsafe because these are notebooks for programming practice and the user can be expected to clear cell outputs now and then. – alexis May 04 '15 at 17:04
  • PS. Thanks for the nbconvert thread... I've been gradually piling up my own conversion scripts because nbconvert's guts are completely opaque (and not very well documented). Maybe this will lead me to a source of better explanations. – alexis May 04 '15 at 17:08
  • Not sure if this is relevant, but maybe [this example](https://github.com/maxalbert/auto-exec-notebook) can also be helpful to better understand `nbconvert` and automated notebook execution. I haven't looked at it in detail, but in my limited experience the nbconvert machinery seems to be relatively tidy in the latest version of IPython (and potentially simpler than in previous versions due to the simplification of the notebook format itself). – cilix May 04 '15 at 19:03
  • 1
    Is this outdated information now, since https://stackoverflow.com/a/53723360/8508004 works? Or maybe an `img` tag in markdown only works in some places? I am presently using MyBinder-served sessions, like from [here](https://github.com/binder-examples/requirements), or Azure notebooks, and an example notebook [here](https://gist.github.com/fomightez/3f478dc0e059e620a0f27368e9ce96df) works. (base64 in an `img` tag in a markdown cell also renders in [github](https://gist.github.com/fomightez/3f478dc0e059e620a0f27368e9ce96df) & [nbviewer](https://gist.github.com/fomightez/3f478dc0e059e620a0f27368e9ce96df).) – Wayne Dec 20 '19 at 19:35
  • 1
    @Wayne, you are absolutely right! Apparently the html sanitizer has been changed to accept embedded images (presumably it has been confirmed to be safe). I'll update the question. – alexis Dec 10 '20 at 09:15
3

The reason why the

<img src="...

tag doesn't do anything when you put it in a markdown cell is because IPython uses an HTML sanitizer (something called Google Caja) that screens out this type of tag (and many others) before it can be rendered.

The HTML sanitizer in IPython can be completely disabled by adding the following line to your custom.js file (usually located at ~/.ipython/profile_default/static/custom/custom.js):

IPython.security.sanitize_html = function (html) { return html; };

It's not a great solution though, as it does create a security risk, and it doesn't really help that much with distribution.

Postscript:
The ability to render base64 encoded strings as images != obvious security concern, so there should be a way for the Caja people to eventually allow this sort of thing through (although the related feature request ticket was first opened back in 2012, so don't hold your breath).

tel
  • 1
    That's a good lead! Notebooks have a concept of "trusted" notebook, implemented (I think) as a cryptographic key once a notebook has been inspected by the user. The reasonable thing to do would be to relax html sanitizing for trusted notebooks. Any ideas on how this could be set up? – alexis Oct 01 '15 at 18:43
  • 1
    @alexis *le sigh* I tried that too. Clicking though the `File -> Trust Notebook` menu dialog doesn't seem to affect HTML sanitization one way or the other. You're right that it should, though. I imagine that it will have to be implemented in the IPython codebase. Perhaps you or I will get around to submitting a pull request? – tel Oct 01 '15 at 20:49
  • Not me, I'd have no idea how to make this conditional on the trusted property (or where to put it, actually). And I'm not sure that completely disabling sanitization is right-- there should be a larger whitelist of features that are let through. If you're sufficiently interested to do it, I'd be very curious to see if the developers take up the idea! – alexis Oct 02 '15 at 12:07
  • 1
    I'd be surprised to learn there was no config option to specify allowed html tags for Google Caja ('HTML Sanitizer' in the debug output in the Chrome JS console). Most html sanitization libraries have some option for white-listing tags (e.g. forum frameworks, bullet boards, etc) – tommytwoeyes Nov 22 '15 at 20:33
  • 2
    Update: Well, color me surprised. According to [this iPython documentation](https://ipython.org/ipython-doc/3/notebook/security.html#our-security-model) HTML and JavaScript in Markdown cells are NEVER trusted. Clicking through `File -> Trust Notebook` as described by @tel above will only allow HTML and JavaScript __output__ to be trusted. Still, I can't believe there is _no_ way to accomplish this without hacking iPython code. In my circumstances, I simply want to embed some interactive graphs from [Desmos.com](http://desmos.com) into some notebooks. – tommytwoeyes Nov 22 '15 at 20:39
  • 2
    This has been a source of a lot of frustration. Is there an issue on github to bump for this? Is there any progress that anyone knows of? I appreciate the security concern, but this basically means I'm back to powerpoint and word to share my data/figures with people who don't use ipython, i.e. my profs. – aeolus Jun 28 '16 at 05:21
  • Heads up: The `<img src="...">` syntax works now! As of 2018 or so, the html sanitizer will let it through. Hurray. – alexis Dec 10 '20 at 09:31
1

If using the IPython HTML() function to output raw HTML, you can embed a linked image in base64 inside an <img> tag using the following method:

import base64
import requests
from IPython.display import HTML

def embedded_image(url):
    """Download an image and return it as a base64 data: URI."""
    response = requests.get(url)
    response.raise_for_status()  # fail loudly rather than embed an error page
    encoded = base64.b64encode(response.content).decode('utf-8')
    # The MIME type is taken from the server's Content-Type header
    return "data:" + response.headers['Content-Type'] + ";base64," + encoded

# Here is a small example. When you export the notebook as HTML,
# the image will be embedded in the HTML file
html = f'<img src="{embedded_image("https://upload.wikimedia.org/wikipedia/commons/5/56/Kosaciec_szczecinkowaty_Iris_setosa.jpg")}" />'
HTML(html)

UPDATE: As pointed out by @alexis, this doesn't actually answer the question: it will not let users re-run cells and have the images persist. It only embeds the images into exports.

gbmhunter
  • 1
    Thanks but this approach is already mentioned in my _question_ (see the **Edit** section). I needed something that will be visible in the live notebook, not in the exported html view. – alexis Oct 31 '19 at 12:50
  • Argh woops sorry classic example of me not reading the question fully! – gbmhunter Nov 01 '19 at 16:00
1

I figured out that replacing the image URL in the ![name](image) with a base64 URL, similar to the ones found above, can embed an image in a markdown container.

Example markdown:

![smile](data:image/png;base64,...)
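A minimal sketch of how such a markdown line could be generated from a local file, using only the standard library. The helper names data_uri and markdown_image are made up for this illustration, and the MIME type is guessed from the file extension, which is an assumption that holds for common image types such as .png and .jpg.

```python
import base64
import mimetypes
from pathlib import Path


def data_uri(path):
    """Return a data: URI holding the base64-encoded contents of an image file."""
    mime, _ = mimetypes.guess_type(str(path))
    if mime is None:
        raise ValueError(f"cannot guess the MIME type of {path}")
    encoded = base64.b64encode(Path(path).read_bytes()).decode("ascii")
    return f"data:{mime};base64,{encoded}"


def markdown_image(path, label="image"):
    """Return markdown image syntax with the image data embedded in the URL."""
    return f"![{label}]({data_uri(path)})"
```

Printing markdown_image("smile.png", label="smile") in a code cell and pasting the result into a markdown cell keeps the image in the notebook even after the file on disk is gone. Large images make the cell source very long, so this is most comfortable for small figures.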
id01
  • 1
    That's really cool, thanks! I don't know how long this feature has been supported, but it certainly works now. Perhaps this was not possible when this question was first asked and answered? In any event, the html sanitizer now allows `<img src="...">`, as well. Life is good! :-) – alexis Dec 10 '20 at 09:08
  • @alexis I read an answer about how to include a image by HTTP URL, then was like "hey, what if I put a base64 URL instead?" And it turned out to work. So I'm spreading the word :P – id01 Dec 11 '20 at 04:45
0

As of Jupyter Notebook 5, you can attach image data to cells, and refer to them from the cell via attachment:<image-file-name>. See the menu Edit > Insert Image, or use drag and drop.
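For reference, this is roughly what an attachment looks like inside the notebook file. The sketch below builds a markdown cell as a plain dictionary, assuming the nbformat 4 layout in which a cell's "attachments" entry maps a file name to a MIME bundle of base64 text. The function name markdown_cell_with_attachment is made up for this illustration.

```python
import base64


def markdown_cell_with_attachment(name, image_bytes, mime="image/png"):
    """Build a markdown cell dict whose image is stored as a cell attachment."""
    encoded = base64.b64encode(image_bytes).decode("ascii")
    return {
        "cell_type": "markdown",
        "metadata": {},
        # file name -> {MIME type -> base64 text}
        "attachments": {name: {mime: encoded}},
        # The cell source refers to the attachment by its file name.
        "source": f"![{name}](attachment:{name})",
    }
```

Appending such a dictionary to the "cells" list of a notebook loaded with the json module should give the same result as Edit > Insert Image, though in practice the menu or drag and drop is simpler.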

Unfortunately, when converting notebooks with attached (embedded) images to HTML, those images will not show up.

To get them into the HTML code, you can use (for instance) nbtoolbelt. It will replace those attachment: references with data: URIs, so the image data is embedded in the img tag.
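The idea behind that replacement can be sketched in a few lines. This is not nbtoolbelt's actual implementation, only an illustration of the technique under the assumption that the cell follows the nbformat 4 attachment layout and that each attachment name appears literally (not URL-encoded) in the cell source. The function name inline_attachments is made up.

```python
def inline_attachments(cell):
    """Return the cell source with attachment: references replaced by data: URIs."""
    source = cell["source"]
    if isinstance(source, list):
        # On disk, multi-line sources are stored as a list of strings.
        source = "".join(source)
    for name, bundle in cell.get("attachments", {}).items():
        # Use the first MIME entry of the bundle for this attachment.
        mime, payload = next(iter(bundle.items()))
        if isinstance(payload, list):
            payload = "".join(payload)
        payload = payload.replace("\n", "")
        source = source.replace(f"attachment:{name}", f"data:{mime};base64,{payload}")
    return source
```

Applying this to every markdown cell before running nbconvert would leave self-contained image URLs in the exported HTML.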

wstomv