1

I have an sqlite database setup with some data. I have imported it through sql statements via pandas:

df1 = pd.read_sql_query("Select avg(Duration),keyword,filename from keywords group by keyword,filename order by filename", con)

The data looks as follows:

Based on this I want to construct a stacked bar graph that looks like this:

I've tried various different solutions including matplotlib, pandas.plot but im unable to successfully construct this graph.

Thanks in advance.

TheMaster
  • 45,448
  • 6
  • 62
  • 85
Lee
  • 59
  • 2
  • 10

2 Answers2

1

This snippet should work:

import pandas as pd
import matplotlib.pyplot as plt

data = [[2, 'A', 'output.xml'], [5, 'B', 'output.xml'], 
    [3, 'A', 'output.xml'], [2, 'B', 'output.xml'], 
    [5, 'C', 'output2.xml'], [1, 'B', 'output2.xml'],
    [6, 'C', 'output.xml'], [3, 'C', 'output2.xml'],
    [3, 'A', 'output2.xml'], [3, 'B', 'output.xml'],
    [2, 'C', 'output.xml'], [1, 'C', 'output2.xml']
   ] 
df = pd.DataFrame(data, columns = ['duration', 'Keyword', 'Filename'])

df2 = df.groupby(['Filename', 'Keyword'])['duration'].sum().unstack('Keyword').fillna(0)
df2[['A','B', 'C']].plot(kind='bar', stacked=True)

It is similar to this question with the difference that I sum the values of the the concerned field instead of counting.

FabioL
  • 932
  • 6
  • 22
1

1.You just have to use:

ax=df.pivot_table(index='fillname',columns='keyword',values='avg(duration)').plot(kind='bar',stacked=True,figsize=(15,15),fontsize=25)
ax.legend(fontsize=25)

2. Example

df=pd.DataFrame()
df['avg(duration)']=[7,4,5,9,3,2]
df['keywoard']=['a','b','c','a','b','c']
df['fillname']=['out1','out1','out1','out2','out2','out2']
df

2.1 Output df example:

   avg(duration)    keywoard    fillname
0   7               a           out1
1   4               b           out1
2   5               c           out1
3   9               a           out2
4   3               b           out2
5   2               c           out2

2.2 Drawing

ax=df.pivot_table(index='fillname',columns='keywoard',values='avg(duration)').plot(kind='bar',stacked=True,figsize=(15,15),fontsize=25)
ax.legend(fontsize=25)

2.3 Output image example: enter image description here

3. In addiccion using:

#set ylim
plt.ylim(-1, 20)
plt.xlim(-1,4)
#grid on
plt.grid()
# set y=0
ax.axhline(0, color='black', lw=1)
#change size of legend
ax.legend(fontsize=25,loc=(0.9,0.4))
#hiding upper and right axis layout
ax.spines['top'].set_visible(False)
ax.spines['right'].set_visible(False)
#changing the thickness
ax.spines['bottom'].set_linewidth(3)
ax.spines['left'].set_linewidth(3)
#setlabels
ax.set_xlabel('fillname',fontsize=20,color='r')
ax.set_ylabel('avg(duration)',fontsize=20,color='r')
#rotation
plt.xticks(rotation=0)

enter image description here

ansev
  • 30,322
  • 5
  • 17
  • 31