1

While doing data analysis with the pandas module in Python, I was trying to write a function that applies the following process to a list of dataframes. (Note: P1_Assessment is one of the dataframes that I would like to analyse.)

P1_Assessment[P1_Assessment > 1].sum(axis=0).astype(int).sort_values(ascending = False).plot(kind = 'bar')

So to analyse a list of data frames in one block of code, I tried to create a function as follows:

def assess_rep(dataframe):
    for i in dataframe:
        a = i[i > 1].sum(axis= 0).astype(int).sort_values(ascending = False)
        a.plot(kind = 'bar')
    return

But when I used the function on a list of dataframes, only the result for the last dataframe was shown. A picture of the console output is attached for clarity.

I searched for similar topics on Stack Overflow but didn't come across anything; maybe I missed something. Any help on this is greatly appreciated!

stasiaks
  • Look up `plt.subplot` to plot all these figures onto same plot. – Haskar P Sep 03 '18 at 13:01
  • Hi Haskar! Could you kindly elaborate in detail how you would go about implementing the subplot function in a for loop as mentioned in the question? Appreciate your help! – Vitamin Kai Sep 05 '18 at 04:05

2 Answers

0

Your problem is that plot creates a plot, but when you call it again in your loop it overwrites the plot from the call before. So what you want to do is keep every plot in a list, or save each one to a file with:

 ax = a.plot(kind='bar')   # for a Series this returns a single Axes, so no indexing
 fig = ax.get_figure()
 fig.savefig("filename.png")

Check out savefig and DataFrame.plot. Edit: taken from "How to save Pandas pie plot to a file?"
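To make the "keep every plot in a list" idea concrete, here is a minimal sketch, not necessarily what the answer had in mind. It assumes each dataframe gets its own figure via `plt.subplots()`, and the names `plot_each`, `frames` and the `plot_%d.png` filenames are made up for illustration; the sample data merely stands in for the real assessment dataframes.

```python
import matplotlib.pyplot as plt
import pandas as pd


def plot_each(dataframes):
    """Draw one bar chart per dataframe, each on its own figure."""
    figures = []
    for df in dataframes:
        # Same aggregation as in the question: keep values > 1, sum per column.
        totals = df[df > 1].sum(axis=0).astype(int).sort_values(ascending=False)
        fig, ax = plt.subplots()        # a fresh figure for every dataframe
        totals.plot(kind="bar", ax=ax)  # draw onto that figure's axes only
        figures.append(fig)
    return figures


# Hypothetical sample data standing in for the real assessment dataframes.
frames = [
    pd.DataFrame({"A": [1, 2, 3], "B": [0, 4, 1]}),
    pd.DataFrame({"A": [5, 1, 1], "B": [2, 2, 2]}),
]
figs = plot_each(frames)
for n, fig in enumerate(figs, start=1):
    fig.savefig("plot_%d.png" % n)  # one file per dataframe
```

Passing `ax=ax` explicitly means no call depends on whichever figure happens to be current, so a later plot cannot draw over an earlier one.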

Sharku
  • Hi Sharku! Thank you for your prompt reply! I tried your method by putting `def assess_rep(dataframe): for i in dataframe: a = i[i > 1].sum(axis= 0).astype(int).sort_values(ascending = False).plot(kind = 'bar') a.savefig('trial.png') return ` However, I got an error message saying that 'AxesSubplot' object has no attribute 'savefig' – Vitamin Kai Sep 03 '18 at 14:08
  • Ah I am sorry I thought that might happen, I investigated a bit more, see the edit. – Sharku Sep 04 '18 at 05:07
  • Sorry, the link is not working. It says: If you're looking for an image, it's probably been deleted or may not have existed at all. – Sharku Sep 05 '18 at 05:08
  • Hi Sharku! Appreciate your help. However, when I implement the code suggested, I am getting another error message saying "'AxesSubplot' object does not support indexing". I've attached the output picture for your reference which can be found here: imgur.com/a/FqDVuNl – Vitamin Kai Sep 05 '18 at 06:45
  • sorry the link wasn't properly formatted just now, it is updated now in the edited comment! – Vitamin Kai Sep 05 '18 at 06:48
  • Hi Sharku! I modified your code to create a different filename for each loop iteration, and multiple images were saved as desired. The only difference is in the savefig command, which I attached here. `fig.savefig("%s.png" % (df))` However, only the last graph was shown in the output console. This shouldn't be too big an issue, but it would still be neat if Python could output all graphs in the console window! – Vitamin Kai Sep 07 '18 at 05:09
  • Hmm, you could try to save every figure object in a list and call fig.show() in the loop. But I think that will just change the plot in the console. So you could introduce a time delay with plt.pause(0.05) to look at each plot one by one. Or add a subroutine that opens all saved pictures at the end of your code. Check out this question: https://stackoverflow.com/questions/11874767/real-time-plotting-in-while-loop-with-matplotlib – Sharku Sep 07 '18 at 08:46
0

I listed two options.

First option is to plot all dataframes in one figure:

def assess_rep(dataframe_list):
    for df in dataframe_list:
        a = df[df > 1].sum(axis= 0).astype(int).sort_values(ascending = False)
        ax = a.plot(kind = 'bar')
    return ax

You can save the figure as a png file with:

ax = assess_rep(dataframe_list)
ax.get_figure().savefig('all_dataframe.png')

Second option is to plot every dataframe separately and save each figure during the process:

import matplotlib.pyplot as plt
def assess_rep(dataframe_list):
    ax_list = []
    counter = 1
    for df in dataframe_list:
        print(counter)  # shows how many times the loop runs
        fig = plt.figure(counter)  # a separate figure per dataframe
        a = df[df > 1].sum(axis= 0).astype(int).sort_values(ascending = False)
        # plot() takes an `ax` argument, not `fig`, so pass this figure's axes
        ax = a.plot(kind='bar', ax=fig.gca())
        ax_list.append(ax)
        ax.get_figure().savefig('single_df_%i.png'%counter)
        counter += 1
    return ax_list
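A third possibility, following the `plt.subplot` hint in the comments under the question, is to give each dataframe its own axes inside one figure. The sketch below is only one way to do it; `assess_rep_grid`, `frames` and the figure size are hypothetical choices, and the sample data is invented for illustration.

```python
import matplotlib.pyplot as plt
import pandas as pd


def assess_rep_grid(dataframe_list):
    """Draw every dataframe's bar chart side by side in a single figure."""
    n = len(dataframe_list)
    # squeeze=False keeps `axes` two-dimensional even when n == 1.
    fig, axes = plt.subplots(1, n, figsize=(4 * n, 3), squeeze=False)
    for ax, df in zip(axes[0], dataframe_list):
        a = df[df > 1].sum(axis=0).astype(int).sort_values(ascending=False)
        a.plot(kind="bar", ax=ax)  # each dataframe lands on its own axes
    fig.tight_layout()
    return fig


# Hypothetical sample data standing in for the real dataframes.
frames = [
    pd.DataFrame({"A": [1, 2, 3], "B": [0, 4, 1]}),
    pd.DataFrame({"A": [5, 1, 1], "B": [2, 2, 2]}),
]
fig = assess_rep_grid(frames)
```

Because all charts share one figure, a single `fig.savefig(...)` call or a single console output would then contain every plot.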
onno
  • Hi, onno! Thank you for your reply! I tried to use both methods, but when I run your code, I am still getting only one graph in the end. I am not sure which part is wrong as your explanation makes sense to me. I have attached the code and the output for your reference . code and output of the first method can be found here imgur.com/a/FxPBlRr ; that of the second method can be found here: imgur.com/a/sQGXLBl – Vitamin Kai Sep 05 '18 at 06:46
  • Hi Kaiwen, that's not what I expected. Can you print the output of the dataframes? print(P1_assesment.head()) – onno Sep 05 '18 at 07:00
  • As the data frame is pretty big with 16 columns, I've attached a summary image of the data frame opened from the variable explorer. Hope this helps! https://imgur.com/a/d16qbci – Vitamin Kai Sep 05 '18 at 08:24
  • Can you check if different figures are saved as .png? – onno Sep 05 '18 at 11:28
  • I changed the code. Can you check if the for loop runs at least 2 times? – onno Sep 05 '18 at 12:39
  • Hi Onno! I tried to run it and this time I am getting 'module object is not callable' error message :( Link is here: https://imgur.com/a/6Eigl4n – Vitamin Kai Sep 07 '18 at 03:27
  • How do you import matplotlib? Please use: 'import matplotlib.pyplot as plt' – onno Sep 07 '18 at 10:33