I have a program which outputs between 1 and 4 pandas dataframes, each with the structure below:

        a    b
time              
2008  11.61  11.99
2009  12.54  10.66
2010  13.64  12.34
2011  14.02  13.20

The number of rows grows if I add years and the number of columns grows if I add cities, but all of the dataframes cover the same years and the same cities; each one just holds the results returned by a different database.

I'm trying to write a function that automates the step where I combine all the dataframes and create one large figure representing all of them.

To do it with 2 dataframes, I would write:
df1 = "pandas function goes here"
df2 = "pandas function goes here"
fig, ([ax1, ax2]) = plt.subplots(1, 2, figsize=(10, 5))
df1.plot(ax=ax1)
df2.plot(ax=ax2)
plt.show()

To do it with 3 dataframes, I would write:
df1 = "pandas function goes here"
df2 = "pandas function goes here"
df3 = "pandas function goes here"
fig, ([ax1, ax2, ax3]) = plt.subplots(1, 3, figsize=(15, 5))
df1.plot(ax=ax1)
df2.plot(ax=ax2)
df3.plot(ax=ax3)
plt.show()

Noticing a pattern, I've tried to automate it with

ax_list = []
dataframe_list = []
plot_num = 4
for i in range(plot_num):
    exec(f'dataframe_{i} = "pandas function goes here"')
    dataframe_list.append(f'dataframe_{i}')
    ax_list.append(f'ax{i}')
    if i == plot_num - 1:
        exec(f'fig, ({exec(", ".join(ax_list))}) = plt.subplots(1, {(plot_num + 1)}, figsize=({(plot_num + 1) * 5}, 5))')
        for x in range(len(dataframe_list)):
            # print(f'{dataframe_list[i]}')
            exec(f'{dataframe_list[x]}.plot()')
plt.show()

I'm getting the error below

  File "3.py", line x (the one that starts with exec(f'fig)), in <module>
    f'fig, ({exec(", ".join(ax_list))}) = plt.subplots(1, {(plot_num + 1)}, figsize=({(plot_num + 1) * 5}, 5))')
  File "<string>", line 1, in <module>
NameError: name 'ax0' is not defined

Please help me automate this so that all the pandas dataframes are displayed in one single figure. (Note: I'm not sharing the pandas functions for brevity's sake; I can share them if you think it's relevant.)

petezurich
bradfreely

1 Answer

One way to do this is to use a pandas MultiIndex to assemble the frames into a single "multi-dimensional" dataframe. Another way would be to collect the frames in a Python list.

Seeing and using the pattern as you've done is a great idea. However, using exec is almost always a bad idea; you need to write code to generate code, and that usually ends up being fragile: difficult to extend and difficult to debug.
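For context on the error in the question: inside the f-string, `exec(", ".join(ax_list))` runs the text `ax0, ax1, ax2, ax3` as code, which looks up `ax0` before anything has been assigned to it, hence the `NameError`. The list-based alternative mentioned above needs no generated names at all. The following is only a minimal sketch: the helper name `plot_frames` and the random stand-in data are assumptions for illustration, not part of the original code.

```python
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt


def plot_frames(frames, width_per_plot=5, height=5):
    """Plot each dataframe in `frames` on its own axis, side by side."""
    n = len(frames)
    # squeeze=False always returns a 2-D array of axes, even when n == 1
    fig, axs = plt.subplots(1, n, figsize=(width_per_plot * n, height),
                            squeeze=False)
    for ax, df in zip(axs[0], frames):
        df.plot(ax=ax)
    return fig, axs[0]


# Stand-in data (hypothetical): replace with the real "pandas function" results.
years = pd.Index(range(2008, 2012), name="time")
rng = np.random.default_rng(0)
frames = [
    pd.DataFrame(rng.uniform(9, 15, size=(len(years), 2)),
                 index=years, columns=["a", "b"])
    for _ in range(3)
]

fig, axs = plot_frames(frames)
```

Calling `plt.show()` afterwards displays the figure. Because the axes come back as an array and the frames live in a list, the same call works for 1, 2, 3 or 4 dataframes.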

Below is an approach using MultiIndex; the resulting figures are at the bottom.

You can see the 5-frame dataset doesn't look so good, mostly because of the legend. You can fix that as described here if it's a problem.
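One possible way to tidy the legend, which is not necessarily the approach linked above, is to suppress the per-axis legends and draw a single figure-level legend. This relies on all frames having the same columns, as stated in the question. The function name `plot_with_shared_legend`, the stand-in data and the 0.85 layout margin are assumptions chosen for this sketch.

```python
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt


def plot_with_shared_legend(frames):
    """Stack one axis per dataframe and draw a single figure-level legend."""
    fig, axs = plt.subplots(len(frames), sharex=True, squeeze=False)
    for ax, df in zip(axs[:, 0], frames):
        df.plot(ax=ax, legend=False)  # suppress the per-axis legend
    # All frames share the same columns, so the handles of the first axis
    # describe every panel.
    handles, labels = axs[0, 0].get_legend_handles_labels()
    fig.legend(handles, labels, loc="center right")
    # Leave room on the right for the legend (0.85 is an arbitrary choice).
    fig.tight_layout(rect=(0, 0, 0.85, 1))
    return fig, axs[:, 0]


# Stand-in data (hypothetical), shaped like the frames in the question.
years = pd.Index(range(2008, 2012), name="time")
rng = np.random.default_rng(0)
frames = [
    pd.DataFrame(rng.uniform(9, 15, size=(len(years), 2)),
                 index=years, columns=["a", "b"])
    for _ in range(4)
]

fig, axs = plot_with_shared_legend(frames)
```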

import string

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt


# make one dataframe
def make_frame(year0=2008, year1=2011, ncities=2):
    index = pd.Index(np.arange(year0, year1+1), name='time')
    cities = list(string.ascii_lowercase[:ncities])

    data = np.random.uniform(low=9, high=15, size=(len(index), ncities))

    return pd.DataFrame(data, index=index, columns=cities)

# make a dataset of a few dataframes, assemble them with a MultiIndex
def make_dataset(nframes=4, year0=2008, year1=2011, ncities=2):
    keys = np.arange(nframes)
    return pd.concat((make_frame(year0, year1, ncities)
                      for k in keys),
                     keys=keys,
                     names=['frame'])

# plot a dataset, each frame on a new axis
def plot_dataset(df):
    nframes = len(df.index.levels[0])
    # squeeze=False keeps axs a 2-D array even when there is only one frame
    fig, axs = plt.subplots(nframes, sharex=True, squeeze=False)
    for ax, (key, frame) in zip(axs[:, 0], df.groupby('frame')):
        frame = frame.droplevel(0)
        frame.plot(ax=ax)
        ax.set_title(f'{key}')
    fig.tight_layout()
    return axs[:, 0]


if __name__ == '__main__':
    plot_dataset(make_dataset(nframes=2, year0=2000, year1=2010, ncities=3))
    plot_dataset(make_dataset(nframes=5, year0=1990, year1=2022, ncities=4))
    plt.show()

[Figure: plot of the 2-frame dataset]

[Figure: plot of the 5-frame dataset]
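If the dataframes already exist, rather than being generated as in `make_dataset`, they can be assembled into the same MultiIndex layout by passing them to `pd.concat` as a dict. This is a sketch only: the keys `db1`, `db2`, `db3` and the random data are hypothetical stand-ins for the real database results.

```python
import numpy as np
import pandas as pd

# Stand-ins for the real results; the names "db1", "db2", "db3" are hypothetical.
years = pd.Index(range(2008, 2012), name="time")
rng = np.random.default_rng(0)
results = {
    name: pd.DataFrame(rng.uniform(9, 15, size=(len(years), 2)),
                       index=years, columns=["a", "b"])
    for name in ["db1", "db2", "db3"]
}

# A dict passed to pd.concat uses its keys as the outer index level;
# names=["frame"] labels that level so groupby("frame") can find it.
dataset = pd.concat(results, names=["frame"])

# Each original frame can be recovered by selecting on the outer level.
db2 = dataset.loc["db2"]
```

The resulting `dataset` has the index levels `frame` and `time`, so it should be usable with `plot_dataset` above in place of the output of `make_dataset`.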

Rory Yorke
  • Thanks but I guess the absence of data kind of confuses me. That and I've never used multiindex before. If it's not too much of a bother, could you please demonstrate on my own code? I posted it here: https://privatebin.net/?abc2a83e53318bf2#AyCNYQ9pmUfWp3SP67vg8RfE1MAnmrch2fcA2JTMN2Pf – bradfreely Nov 05 '22 at 07:33