Automate graph generation with Seaborn using Pandas dataframe

Question

Seaborn's FacetGrid lets me generate multiple graphs for one person. However, I cannot work out how to modify the code so that a loop repeats the same process for 20 different people.

It breaks when I try to call each different person's dataframe. The problem is that I am calling a string with the dataframe name instead of calling the dataframe itself. How do I fix that?

I started with one very large dataframe and made separate dataframes for each person from it. When I try to loop through the individual people's dataframes, I am not able to reference the dataframe itself.

These seem to be the lines with the issues:

for i in Person_u:
    output_file=(i + '.png')
    input_file=(i + '.csv')
    title=i
    db=('df_' + i)

Below I have included both the code that works for 1 person and the code that does not work for looping through multiple people.

# import libraries ...
# import data from csv file ...
#create data frame from values in the csv file
df = pd.read_csv(input_file, sep=',', delimiter=None, header='infer', 
    names=['LH', 'RevID', 'OrigID', 'Person', 'Date', 'File', 
    'Threshold', 'StepSize', 'RevNum', 'WL', 'RevPos', 'ExpNum', 'Light', 'ThExp'], 
    usecols=['OrigID', 'Person', 'Date', 'Threshold', 'RevNum', 'WL', 'RevPos', 'ExpNum', 'ThExp'], 
    engine='python', skiprows=1, infer_datetime_format=True)

# By Experiment
# Experiment 1, 2, 3, 4 (hundreds of rows, etc.)
df_TLR_1 = df.loc[(df.Person == 'TLR') & (df.ExpNum == 1)]
df_KJE_1 = df.loc[(df.Person == 'KJE') & (df.ExpNum == 1)]
df_NMP_2 = df.loc[(df.Person == 'NMP') & (df.ExpNum == 2)]
df_SFO_2 = df.loc[(df.Person == 'SFO') & (df.ExpNum == 2)]
df_MTC_3 = df.loc[(df.Person == 'MTC') & (df.ExpNum == 3)]
df_ZBL_3 = df.loc[(df.Person == 'ZBL') & (df.ExpNum == 3)]
df_MTC_4 = df.loc[(df.Person == 'MTC') & (df.ExpNum == 4)]
df_RJI_4 = df.loc[(df.Person == 'RJI') & (df.ExpNum == 4)]

Person_u = df.Person.unique()
ExpNum_u = df.ExpNum.unique()
WL_u = df.WL.unique()
ThExp_u = df.ThExp.unique()

# seaborn set style
sns.set(style="ticks")
grid = sns.FacetGrid(df_TLR_1, col="WL", hue="ThExp", col_wrap=4, size=4)
grid.map(plt.axhline, y=0, ls=":", c=".5")
# Draw a horizontal line showing min max constraints of staircase
if df_TLR_1.iloc[0,7] == 1:
    grid.map(plt.axhline, y=-60, ls=":", c=".5")
    grid.map(plt.axhline, y=40, ls=":", c=".5")
    grid.map(plt.plot, "RevNum", "RevPos", marker="o", ms=4)
    grid.set(xticks=np.arange(13), yticks=[-65, -60, -40, -20, 0, 20, 40, 45], xlim=(-.5, 12.5), ylim=(-65, 45))
elif df_TLR_1.iloc[0,7] == 4:
    grid.map(plt.axhline, y=-50, ls=":", c=".5")
    grid.map(plt.axhline, y=50, ls=":", c=".5")
    grid.map(plt.plot, "RevNum", "RevPos", marker="o", ms=4)
    grid.set(xticks=np.arange(13), yticks=[-60, -40, -20, 0, 20, 40, 60], xlim=(-.5, 12.5), ylim=(-65, 65))
else:
    print('Error. Experiment Number not 1-4.')
# Draw a line plot to show reversals of staircase
# Adjust the arrangement of the plots
grid.fig.tight_layout(w_pad=.5)
this_name=df_TLR_1.iloc[0,1]
th_experiment=df_TLR_1.iloc[0,8]
this_experiment=th_experiment[-4:8]

#plt.suptitle(df_TLR_1.iloc[0,1] + df_TLR_1.iloc[0,8], fontsize=20)
plt.suptitle(this_name + ' ' + this_experiment, fontsize=20, ha='right')
plt.savefig(this_name + ' ' + this_experiment + '.png')
plt.show()

When I try to change it to run through each unique person, I am unable to append the three letters and the experiment number to build df_XXX_X, for example to change df_RJI_1 to df_MTC_3.

for i in Person_u:
    output_file=(i + '.png')
    input_file=(i + '.csv')
    title=i
    db=('df_' + i)
    #seaborn set style
    sns.set(style="ticks")
    grid = sns.FacetGrid(db, col="WL", hue="ThExp", col_wrap=5, size=4)
    grid.map(plt.axhline, y=0, ls=":", c=".5")
    # Draw a horizontal line showing min max constraints of staircase
    if db[0,7] == 1:
        grid.map(plt.axhline, y=-40, ls=":", c=".5")
        grid.map(plt.axhline, y=60, ls=":", c=".5")
    elif db[0,7] == 4:
        grid.map(plt.axhline, y=-50, ls=":", c=".5")
        grid.map(plt.axhline, y=50, ls=":", c=".5")
    else:
        print('Error. Experiment Number not 1-4.')
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-11-2b9785282fe1> in <module>()
      6     #seaborn set style
      7     sns.set(style="ticks")
----> 8     grid = sns.FacetGrid(db, col="WL", hue="ThExp", col_wrap=5, size=4)
      9     grid.map(plt.axhline, y=0, ls=":", c=".5")
     10     # Draw a horizontal line showing min max constraints of staircase

c:\users\rijekah\appdata\local\programs\python\python35\lib\site-packages\seaborn\axisgrid.py in __init__(self, data, row, col, hue, col_wrap, sharex, sharey, size, aspect, palette, row_order, col_order, hue_order, hue_kws, dropna, legend_out, despine, margin_titles, xlim, ylim, subplot_kws, gridspec_kws)
    235             hue_names = None
    236         else:
--> 237             hue_names = utils.categorical_order(data[hue], hue_order)
    238 
    239         colors = self._get_palette(data, hue, hue_order, palette)

TypeError: string indices must be integers
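
The final TypeError can be reproduced without seaborn. A minimal sketch, assuming only that db holds a string such as 'df_TLR', as it does in the loop above: FacetGrid evaluates data[hue], and indexing a str with another str raises TypeError. The exact wording of the message may vary between Python versions.

```python
# db is built by string concatenation, so it is a str, not a DataFrame.
db = "df_" + "TLR"

try:
    # FacetGrid does the equivalent of data[hue] internally.
    db["ThExp"]
    raised = False
except TypeError as exc:
    raised = True
    print(type(exc).__name__, exc)

print(type(db).__name__)  # the type of db is str
```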

This is an example of the graph when it does work:

# seaborn set style
sns.set(style="ticks")
grid = sns.FacetGrid(df_TLR_1, col="WL", hue="ThExp", col_wrap=3, size=6)
grid.map(plt.axhline, y=0, ls=":", c=".5")
# Draw a horizontal line showing min max constraints of staircase
if df_TLR_1.iloc[0,7] == 1:
    grid.map(plt.axhline, y=-60, ls=":", c=".5")
    grid.map(plt.axhline, y=40, ls=":", c=".5")
    grid.map(plt.plot, "RevNum", "RevPos", marker="o", ms=4)
    grid.set(xticks=np.arange(13), yticks=[-65, -60, -40, -20, 0, 20, 40, 45], xlim=(-.5, 15.5), ylim=(-65, 45))
elif df_TLR_1.iloc[0,7] == 4:
    grid.map(plt.axhline, y=-50, ls=":", c=".5")
    grid.map(plt.axhline, y=50, ls=":", c=".5")
    grid.map(plt.plot, "RevNum", "RevPos", marker="o", ms=4)
    grid.set(xticks=np.arange(13), yticks=[-60, -40, -20, 0, 20, 40, 60], xlim=(-.5, 12.5), ylim=(-65, 65))
else:
    print('Error. Experiment Number not 1-4.')

# Adjust the arrangement of the plots
grid.fig.tight_layout(w_pad=.5)
this_name=df_TLR_1.iloc[0,1]
th_experiment=df_TLR_1.iloc[0,8]
this_experiment=th_experiment[-4:8]

# add figure title and save figure
plt.suptitle(this_name + ' ' + this_experiment, fontsize=20, ha='right')
plt.savefig(this_name + ' ' + this_experiment + '.png')

The issue is that the db variable you are creating is a string. You want it to be the dataframe you are referencing, but it is being set as a string with the same name as the dataframe. — C Haworth, Jun 26 '18 at 19:44
Yes! How do I make it the dataframe itself instead of the string? — Rebecca Ijekah, Jun 27 '18 at 00:23

score 0 · Answer 1 · answered Jun 27 '18 at 17:37

You are creating a string that contains the variable's name, not a reference to the dataframe itself. Using the built-in eval function will fix this.

Instead of what you currently have for the db line, change it to

db = eval('df_'+i)

and that should fix your problem.
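
Note that the variables in the question also carry an experiment-number suffix (df_TLR_1), so 'df_' + i on its own would not match any of them. If the per-person dataframes need to stay as separate objects, one common alternative to eval is to store them in a dictionary keyed by name. A minimal sketch, assuming a small made-up dataframe and a hypothetical dictionary called frames:

```python
import pandas as pd

# Made-up stand-in for the question's large dataframe.
df = pd.DataFrame({
    "Person": ["TLR", "TLR", "KJE", "MTC"],
    "ExpNum": [1, 1, 1, 3],
    "RevPos": [-10, 5, -20, 0],
})

# Keep each subset under a string key instead of in a variable named df_XXX_N.
# 'frames' is a hypothetical name, not something from the question.
frames = {
    "TLR_1": df.loc[(df.Person == "TLR") & (df.ExpNum == 1)],
    "KJE_1": df.loc[(df.Person == "KJE") & (df.ExpNum == 1)],
    "MTC_3": df.loc[(df.Person == "MTC") & (df.ExpNum == 3)],
}

for name, db in frames.items():
    # db is the DataFrame itself; name is the string that identifies it.
    print(name, type(db).__name__, len(db))
```

Looking up frames["TLR_1"] then returns the dataframe, so a string built inside a loop can be used as the key.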

C Haworth


Almost never is "eval" the best answer, and even when it is, there should be a large asterisk reminding readers that it is to be used sparingly, if ever. A much better answer would be to just use `df.loc` with logic about the person -- exactly how the original subsets were created! eg `db = df.loc[df.Person == i]` – kevinsa5 Jun 27 '18 at 18:33
Are there any resources you can point me to on why eval is bad? I did not know – C Haworth Jun 28 '18 at 13:28
https://stackoverflow.com/questions/1832940/why-is-using-eval-a-bad-practice -- https://nedbatchelder.com/blog/201206/eval_really_is_dangerous.html -- http://stupidpythonideas.blogspot.com/2013/05/why-evalexec-is-bad.html – kevinsa5 Jun 28 '18 at 16:39
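
The df.loc suggestion from the comments above could be sketched roughly as follows. This is a sketch under assumptions: the dataframe contents are made up, subset_for and output_files are hypothetical names, and the plotting call is left as a comment so the example runs without seaborn.

```python
import pandas as pd

# Made-up stand-in for the question's large dataframe.
df = pd.DataFrame({
    "Person": ["TLR", "TLR", "KJE", "KJE", "MTC", "MTC"],
    "ExpNum": [1, 1, 1, 1, 3, 4],
    "WL": [450, 550, 450, 550, 450, 450],
    "RevNum": [0, 1, 0, 1, 0, 0],
    "RevPos": [-10, 5, -20, 10, 0, 15],
})


def subset_for(frame, person, exp_num):
    """Return the rows for one person and one experiment (hypothetical helper)."""
    return frame.loc[(frame.Person == person) & (frame.ExpNum == exp_num)]


# Each (Person, ExpNum) combination once, in order of first appearance.
pairs = df[["Person", "ExpNum"]].drop_duplicates()

output_files = []
for person, exp_num in pairs.itertuples(index=False):
    db = subset_for(df, person, exp_num)
    output_file = f"{person}_{exp_num}.png"
    output_files.append(output_file)
    # db is a DataFrame here, so the question's plotting code can take it
    # directly, e.g. sns.FacetGrid(db, col="WL", ...), followed by
    # plt.savefig(output_file).

print(output_files)
```

Filtering inside the loop means no df_XXX_N variables are needed at all, and a person with several experiments (MTC in this made-up data) gets one subset per experiment.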
