-4

I have a sales pandas dataframe, where each row represents a company name, and there are four columns showing the current, minimum, maximum, and average sales amount over the last five years.

I wonder if there's a way to plot the min, max, avg, current horizontal bars inside the dataframe.

Just to give you a concrete example: https://libguides.lib.umanitoba.ca/bloomberg/fixedincome

If you look at the "Range" column, that's precisely what I'm trying to replicate inside the dataframe. I found matplotlib boxplot but I don't think I can plot them inside the dataframe.

Are you aware of any solutions?

jpp
  • 159,742
  • 34
  • 281
  • 339
Andrea
  • 113
  • 1
  • 4
  • 10
  • I didn't get what you mean by "plotting inside dataframe". We never plot anything inside dataframe. We use the values from dataframe and plot them. – Shridhar R Kulkarni Mar 19 '18 at 14:31
  • @ShridharRKulkarni if you look at the attached link, do you see those horizontal lines that show an orange dot to represent the current average, a blue dot that represents the current (most recent) value, and their distance from the extremes of the horizontal line are derived vs the minimum and maximum value. So I'm asking if python has something similar. – Andrea Mar 19 '18 at 14:34
  • This is really unclear. I could imagine you want something similar to [Matplotlib- Creating a table with line plots in cells](https://stackoverflow.com/questions/47779560/matplotlib-creating-a-table-with-line-plots-in-cells) just with boxplots? But please write a clear question with a problem description explaining what you want to achieve and what hinders you obtaining it. Also if this is about pandas dataframes use the respective tags. – ImportanceOfBeingErnest Mar 19 '18 at 14:41
  • @ImportanceOfBeingErnest ok, so more than sharing literally a screenshot I don't know whatelse I can do. in Excel, guys, I'm talking about excel, I can build a macro in order to visualize the minimum, the maximum, the average, and the current value, in a normal table. Now, I do not know if there's a matplotlib code that allows to visualize this data inside the dataframe like the colormap. Hence I tagged python and matplotlib. The plotbox shows more than 4 values and all I simply need is to plot horizontally four values. – Andrea Mar 19 '18 at 14:46
  • If you guys don't understand please avoid useless comments like "this is really unclear" because all you need to do is to look at a graph. – Andrea Mar 19 '18 at 14:48
  • If you don't understand [ask], please avoid asking questions here. – ImportanceOfBeingErnest Mar 19 '18 at 14:49
  • @ImportanceOfBeingErnest even if I believe you've been a little arrogant with your answers, I want to apologize for my answers. I will do my best to be more specific and clear with my questions. Sorry ImportanceOfBeingErnest. – Andrea Mar 19 '18 at 17:04

2 Answers2

1

I am not totally sure what exactly you are looking for, so if you need anything else, please tell me.

I used pandas to create some dummy data and matplotlib for the graphs.

import pandas as pd
import matplotlib.pyplot as plt
df = pd.DataFrame({'current':[3,4,7], 'minimum':[1,3,2], 'maximum':[10,14,11], 'average':[8,5,9]})

#   average  current  maximum  minimum
#0        8        3       10        1
#1        5        4       14        3
#2        9        7       11        2

Now the important part. I somewhat recreated your example from the image. This loop iterates over every row in your dataframe, that is, your companies. The result is as many graphs as you have companies.

  • ax.plot creates a straight line from the minimum value to the maximum value.
  • ax.scatter creates points for the current and average values.

Of course, you have to adjust the graph a bit to make it look like you want it to.

for index,row in df.iterrows(): 
    fig, ax = plt.subplots()
    ax.plot([df['minimum'][index],df['maximum'][index]],[0,0],zorder=0)      
    ax.scatter(df['current'][index],0,zorder=1)
    ax.scatter(df['average'][index],0,zorder=2)

This would be the graph for the first company. enter image description here

Edit (see @Andrea's comment): Putting the plotted data closer together

You can follow the approach above but adjust the style of the graphs.

for index,row in df.iterrows(): 
    fig, ax = plt.subplots(figsize=(7, 0.2)) # adjust the width and height of the graphs
    ax.plot([df['minimum'][index],df['maximum'][index]],[0,0],color='gray',zorder=0)      
    ax.scatter(df['current'][index],0,zorder=1)
    ax.scatter(df['average'][index],0,marker='D',zorder=2)   
    plt.xticks([]) # disable the ticks of the x-axis
    plt.yticks([]) # disable the ticks of the y-axis   
    for spine in plt.gca().spines.values(): # disable the border around the graphs
        spine.set_visible(False)

This looks pretty close to the image you posted in your question. enter image description here

NK_
  • 361
  • 1
  • 4
  • 11
  • this is actually very helpful. Do you think it would be possible to have those plots literally in one column, with no space between each other? – Andrea Mar 19 '18 at 17:03
  • @Andrea I edited my post. Let me know if you need something different. – NK_ Mar 20 '18 at 07:12
  • this is fantastic! One last question please. If you can't help I totally understand. So the final goal would be to use your code and the plots you created with subplots and have them in a table-like format. I checked what ImportanceOfBeingErnest did here: https://stackoverflow.com/questions/47779560/matplotlib-creating-a-table-with-line-plots-in-cells so the question for you is: could we have the graphs you have created as subplots column (col 6) and have in col 1 the name of the company, col 2 the avg, col 3 the value of the min, col 4 the value of the max, col 5 the curr. value? – Andrea Mar 21 '18 at 16:34
  • I'm trying to combine your solution with the one on that link but there are some parts of the code that are really difficult for me to understand. Please don't take it as me being lazy but honestly I'm still not very good with matlplotlib. Thank you in any case. Andrea – Andrea Mar 21 '18 at 16:35
0

Ok, so based on NK_ help and the content available in: Matplotlib- Creating a table with line plots in cells?

I managed to put this together:

import pandas as pd
import matplotlib.pyplot as plt
import numpy as np
%matplotlib inline

df = pd.DataFrame({'Name':["A","B","C","E","F"],'current':[3,4,7,6,6], 'minimum':[1,3,2,4,1], 'maximum':[10,14,11,7,10], 'average':[8,5,9,5,3]})


data = np.random.rand(100,5)
col1 = df["Name"]
col2 = df["current"]
col2colors = ["red", "g", "r", "r", "r"]
col3 = df["average"]
finalsc = "D+"

fig, axes = plt.subplots(ncols=5, nrows=5, figsize=(6,2.6),
                         gridspec_kw={"width_ratios":[1,1,1,3,3]})
fig.subplots_adjust(0.05,0.05,0.95,0.95, wspace=0.02, hspace=0.05)   #wspace, hspace --> bordi interni grigi della tabella

for ax in axes.flatten():
    ax.tick_params(labelbottom=0, labelleft=0, bottom=0, top=0, left=0, right=0)
    ax.ticklabel_format(useOffset=False, style="plain")
    for _,s in ax.spines.items():
        s.set_visible(True)

border = fig.add_subplot(111)
border.tick_params(labelbottom=0, labelleft=0, bottom=0, top=0, left=0, right=0)
border.set_facecolor("None")

text_kw = dict(ha="center", va="bottom", size=15)
for i,ax in enumerate(axes[:,0]):
    ax.text(0.5, 0.2, col1[i], transform=ax.transAxes, **text_kw)

for i,ax in enumerate(axes[:,1]):
    ax.text(0.5, 0.2, "{:.2f}".format(col2[i]),transform=ax.transAxes, **text_kw)
    ax.set_facecolor(col2colors[i])
    ax.patch.set_color(col2colors[i])

for i,ax in enumerate(axes[:,2]):
    ax.text(0.5, 0.2, "{:.2f}".format(col3[i]),transform=ax.transAxes, **text_kw)

for i,ax in enumerate(axes[:,3]):
    ax.plot(data[:,i], color="green", linewidth=1)

for i,ax in enumerate(axes[:,4]):
    ax.plot([df['minimum'][index],df['maximum'][index]],[0,0],zorder=0)      
    ax.scatter(df['current'][index],0,zorder=1)
    ax.scatter(df['average'][index],0,zorder=2)

plt.show()

To be fully honest, I don't know if the code I put together is the best code I could have used, there are many parts I still have to understand.

Pleas, the last question I would have is: could somebody help me to add to this table a first "row" in which we show in bold the titles of each column? Thanks

Andrea
  • 113
  • 1
  • 4
  • 10