I was hoping somebody would be able to help me. I am trying to store a list of images saved from Matplotlib as a dataframe (or a list) and then add it to an existing dataframe, effectively creating a small bar chart for each entry in the dataframe (i.e. data bars).

I have managed to save the images successfully with a loop. There are 242 images. How can I show these images in a column of a dataframe? I want it to be easy to append that column to my existing dataframe, to show visually the number of zero values in each row of this dataset. My code raises an error saying that a 'NoneType' object is not iterable.

This is my code. (Top half just here for clarification as to what q1 and q2 are.)

Thanks.

import csv
import pandas as pd
import matplotlib.pyplot as plt
import numpy as np
import sys

q1 = pd.read_csv(r"data\q1.csv") #dataframe (pandas is imported as pd)
q1 = q1.apply(lambda x: x.str.strip() if x.dtype == "object" else x) #strip whitespace; apply returns a new dataframe
q1 = q1.dropna()
code = q1.loc[:,"Code"]
region = q1.loc[:,"Region"]
name = q1.loc[:,"Name"]
heads = list(q1.columns.values) #creates list of header values

nz = (q1 == 0).sum(axis=1) #count number of zero values in rows
q2 = q1[['Code','Region','Name']]
q2 = q2.assign(nz=nz.values)

samples=[]
y=1
for val in q2['nz']:
    val = val/q2['nz'].max() * 100

    plt.barh(val, width = val, color="blue")
    plt.xlim((0,100))
    plt.yticks([0])
    plt.axis('off')

    x = plt.savefig("value" + str(y) + ".png", bbox_inches='tight')
    samples.append(x)
    plt.close()


    y = y + 1

imgdf = pd.DataFrame.from_records(samples)
q3 = q2.append(imgdf)
shinobilou
  • You are saving your image to disk as e.g. "value1.png". How should the dataframe know what it is supposed to contain? Do you want to store the filename? Or the png image itself? – ImportanceOfBeingErnest Sep 01 '18 at 09:39
  • I assumed the easiest thing to do would be to have the images themselves stored and then put them into a dataframe. If only the filename is stored how will the image be made? – shinobilou Sep 01 '18 at 09:55
  • If the filename was stored, you could display the image with that filename in the dataframe. I would invite you to look at [this question](https://stackoverflow.com/questions/46107348/how-to-display-image-stored-in-pandas-dataframe/46112269#46112269) and [this question](https://stackoverflow.com/questions/47038538/insert-matplotlib-images-into-a-pandas-dataframe/47043380#47043380). – ImportanceOfBeingErnest Sep 01 '18 at 10:01
  • OK. I am new to pandas and this is quite difficult for me to understand. Thank you for your answer. I have tried to adapt the code from one of your other answers to my problem, but now I am faced with other problems. Updating the question to reflect the new code. – shinobilou Sep 01 '18 at 11:10
  • Not sure the edited code is of any use. The first link was meant for you to understand your options. The second link shows how you could implement them. – ImportanceOfBeingErnest Sep 01 '18 at 11:20
  • I have changed it back. I don't understand your code or what you are trying to tell me. Sorry. – shinobilou Sep 01 '18 at 11:34
  • It's certainly not easy for beginners. But I suppose anyone would give the exact same answer here, so that wouldn't help either if you don't understand it. The good thing is that the linked code is runnable, so you can play around with it, change some bits and pieces and see what it does, and then adapt it step by step to your data. – ImportanceOfBeingErnest Sep 01 '18 at 11:51

1 Answer

If you are working in a Jupyter notebook, you can render the dataframe as HTML to show the images.

# Some imports
import base64
import pandas as pd

from PIL import Image
from io import BytesIO
from IPython.display import HTML

pd.set_option('display.max_colwidth', None)  # None disables truncation (-1 was deprecated in pandas 1.0)

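def get_thumbnail(path):
    """
    Open the image at `path` and shrink it to fit within 150x150 pixels.
    thumbnail() keeps the aspect ratio and resizes the image in place.
    """
    i = Image.open(path)
    i.thumbnail((150, 150), Image.LANCZOS)
    return i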

def image_base64(im):
    """
    Convert to base64 to be given as the src field of img in HTML
    """
    if isinstance(im, str):
        im = get_thumbnail(im)
    with BytesIO() as buffer:
        # PNG rather than JPEG: PNGs saved by matplotlib are usually RGBA,
        # and Pillow cannot write an image with an alpha channel as JPEG
        im.save(buffer, 'png')
        return base64.b64encode(buffer.getvalue()).decode()

def image_formatter(im):
    return f'<img src="data:image/png;base64,{image_base64(im)}">'

# Skipping some of your code
image_paths = []
y = 1
for val in q2['nz']:
    # ... draw the bar here, as in your loop
    path = "value" + str(y) + ".png"
    plt.savefig(path, bbox_inches='tight')  # savefig returns None, so keep the path instead
    plt.close()

    image_paths.append(path)

    y = y + 1

q2["image_paths"] = image_paths  # one path per row, assigned by position
q2["image"] = q2["image_paths"].map(get_thumbnail)

# Display PIL Images embedded in the dataframe
HTML(q2.to_html(formatters={"image": image_formatter}, escape=False))
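
For reference, the error in the question comes from `plt.savefig` returning `None`, so `samples` ends up as a list of `None` values. If you do not need the PNG files on disk, one alternative is to render each bar into an in-memory buffer and store the resulting `<img>` tag directly in the dataframe. This is only a minimal sketch: the helper name `bar_to_img_tag`, the figure size, and the small stand-in for `q2` are made up for illustration.

```python
import base64
from io import BytesIO

import matplotlib
matplotlib.use("Agg")  # non-interactive backend, no window needed
import matplotlib.pyplot as plt
import pandas as pd


def bar_to_img_tag(percent):
    """Draw one horizontal bar (0-100) and return it as an HTML <img> tag."""
    fig, ax = plt.subplots(figsize=(2, 0.3))
    ax.barh(0, width=percent, color="blue")
    ax.set_xlim(0, 100)
    ax.axis("off")
    buffer = BytesIO()
    fig.savefig(buffer, format="png")  # writes into the buffer; the call itself returns None
    plt.close(fig)
    encoded = base64.b64encode(buffer.getvalue()).decode()
    return f'<img src="data:image/png;base64,{encoded}">'


# Hypothetical stand-in for q2; only the "nz" column matters here
q2 = pd.DataFrame({"Code": ["A", "B", "C"], "nz": [0, 5, 10]})
percent = q2["nz"] / q2["nz"].max() * 100
q2["bar"] = percent.map(bar_to_img_tag)

# Long strings are truncated unless max_colwidth is lifted;
# escape=False keeps the <img> tags as real HTML
with pd.option_context("display.max_colwidth", None):
    html = q2.to_html(escape=False)
```

In a notebook, `HTML(html)` would display the table as above. With 242 rows the HTML string gets fairly large, since every image is embedded in it.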
Deepak Saini
  • How is this different from [the linked answer](https://stackoverflow.com/questions/47038538/insert-matplotlib-images-into-a-pandas-dataframe/47043380#47043380)? Or if it is different, why answer here and not at the linked question? – ImportanceOfBeingErnest Sep 01 '18 at 12:14
  • @ImportanceOfBeingErnest, sorry, but I didn't see the provided link. Should I delete the answer? – Deepak Saini Sep 01 '18 at 12:16
  • I don't know. As said, there might be some bits that are better than in the existing answer, in which case moving this to the linked question may be beneficial for future readers. Given that the OP has problems understanding the linked answer, any answer given here would probably need to explain each step in much more detail. So if you want to do that, keeping a well-commented answer here would actually add value. – ImportanceOfBeingErnest Sep 01 '18 at 12:22
  • Thank you. I like this. I like that it shows where my code fits in with it. I am not working in Jupyter Notebook though, would this work in Python 3.5 using pandas and pyqt5? – shinobilou Sep 01 '18 at 12:27
  • @shinobilou, the `q2.to_html(...)` bit gives you plain HTML code, which you can render anywhere. I showed an example of rendering it in Jupyter using `HTML(string containing html_code)` because I wrote the answer in Jupyter. – Deepak Saini Sep 01 '18 at 12:35
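
As a rough illustration of the last comment, outside Jupyter the HTML string can simply be written to a file and opened in a browser. This is only a sketch under stated assumptions: the small dataframe, the file name `q2_table.html`, and the helper `show_in_browser` are hypothetical, and the `<img>` tags here point at files rather than embedded base64 data. In a PyQt5 application, a widget such as `QWebEngineView` has a `setHtml` method that should accept the same string, but that is untested here.

```python
import tempfile
import webbrowser
from pathlib import Path

import pandas as pd

# Hypothetical stand-in for a dataframe whose "image" column holds <img> tags
df = pd.DataFrame({
    "Name": ["a", "b"],
    "image": ['<img src="value1.png">', '<img src="value2.png">'],
})

# Lift the column-width limit so long tags are not truncated,
# and keep the tags unescaped so they render as images
with pd.option_context("display.max_colwidth", None):
    table_html = df.to_html(escape=False)

# Wrap the table in a minimal page and write it to a temporary location
page = f"<!DOCTYPE html>\n<html><body>\n{table_html}\n</body></html>"
out_path = Path(tempfile.gettempdir()) / "q2_table.html"
out_path.write_text(page, encoding="utf-8")


def show_in_browser(path):
    """Open the saved page in the default web browser."""
    webbrowser.open(Path(path).as_uri())

# show_in_browser(out_path)  # uncomment to view the table
```

Relative image paths such as `value1.png` are resolved against the location of the HTML file, so the PNGs would need to sit next to it, or be embedded as base64 as in the answer.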