Introducing a second y axis into a relplot() call with multiple plots

Question

The Problem

I have 2 dataframes which I combine and then melt with pandas. I need to multi-plot them (as below) and the code needs to be scalable. They consist of 2 variables which form the 'key' column below ('x' and 'y' here), across multiple 'stations' (just 2 here, but needs to be scalable). I've used relplot() to be able to multi-plot the two variables on each graph, and different stations on separate graphs.

Is there any way to maintain this format but introduce a 2nd y axis to each plot? 'x' and 'y' need to be on different scales in my actual data. I've seen examples where the relplot call is stored with y = 1st variable, and a 2nd lineplot call is added for the 2nd variable with ax.twinx() included in it. So in example below, 'x' and 'y' would each have a y axis on the same graph.

How would I make that work with a melted dataframe (e.g. below) where 'key' = 2 variables and 'station' can be length n? Or is the answer to scrap that df format and start again?

Example Code

The multi-plot as it stands:

import numpy as np
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt

np.random.seed(123)
date_range = pd.period_range('1981-01-01','1981-01-04',freq='D')
x = np.random.randint(1, 10, (4,2))
y = np.random.randint(1, 10, (4,2))
x = pd.DataFrame(x, index = date_range, columns = ['station1','station2'])
y = pd.DataFrame(y, index = date_range + pd.to_timedelta(1, unit="D"), columns = ['station1','station2'])

#keep information where each data point comes from
x["key"], y["key"] = "x", "y"
#moving index into a column 
x = x.reset_index()
y = y.reset_index()
#and changing it to datetime values that seaborn can understand
#necessary because pd.Period data is used
x["index"] = pd.to_datetime(x["index"].astype(str))
y["index"] = pd.to_datetime(y["index"].astype(str))

#combining dataframes and reshaping 
df = pd.concat([x, y]).melt(["index", "key"], var_name="station", value_name="station_value")

#plotting
fg = sns.relplot(data=df, x = "index", y = "station_value", kind = "line", hue = "key", row = "station")

#shouldn't be necessary but this example had too many ticks for the interval
from matplotlib.dates import DateFormatter, DayLocator
fg.axes[0,0].xaxis.set_major_locator(DayLocator(interval=1))
fg.axes[0,0].xaxis.set_major_formatter(DateFormatter("%y-%m-%d"))

plt.show()

BigBen · Accepted Answer · 2022-02-10T19:00:42.083

2

You could call relplot for only one key (without hue), then, similar to the linked thread, loop over the subplots, create a twinx on each, and lineplot the second key for that station:

#plotting
fg = sns.relplot(data=df[df['key']=='x'], x="index", y="station_value", kind="line", row="station")

for station, ax in fg.axes_dict.items():  
    ax1 = ax.twinx()
    sns.lineplot(data=df[(df['key'] == 'y') & (df['station'] == station)], x='index', y='station_value', color='orange', ci=None, ax=ax1)
    ax1.set_ylabel('')

edited Feb 10 '22 at 19:00

answered Feb 10 '22 at 18:50

BigBen

Genius. I added `facet_kws={'sharey': False, 'sharex': True}` from answer below to allow the y scales to differ from site to site, as my real world data for 'x' in 'key' differs between stations. Added that and plot size arguments to the initial `relplot()` call. – Ndharwood Feb 10 '22 at 20:11
Just noticed that removing `'hue=..'` gets rid of the legend. Is there any way to add that back in? I have tried many ways so far, the [best](https://stackoverflow.com/questions/58931770/legend-in-for-loop-does-not-work-properly-and-just-shows-the-last-curve) of which doesn't work. – Ndharwood Feb 11 '22 at 17:21
I'm assuming you want all the labels in one legend, like demonstrated [here](https://stackoverflow.com/questions/5484922/secondary-axis-with-twinx-how-to-add-to-legend)? – BigBen Feb 11 '22 at 17:25
Yep, a single legend. Can't seem to get it to recognise more than 1 label for the legend (even outside the loop), which the methods rely on. Saving the `sns.lineplot` call as `fig` object and then calling `fig.legend` on that doesn't work, for example. – Ndharwood Feb 11 '22 at 18:36
It's probably easier to create the legend from scratch. Not sure how production-worthy this alternative is, but here goes. Add `label='x'` to the `relplot` call, and `label='y', legend=False` to the `lineplot` call inside the loop. Then `lines, labels = fg.fig.axes[0].get_legend_handles_labels()`, `lines2, labels2 = fg.fig.axes[-1].get_legend_handles_labels()`, `fg.fig.legend(lines+lines2, labels+labels2, loc='upper right')`. – BigBen Feb 11 '22 at 18:54
Excellent, thanks. Added `borderaxespad=3.5` to that last fig.legend call to shift my legend back inside the plot axis lines. – Ndharwood Feb 11 '22 at 19:24
Oh I'd just do `fg.axes[0,0].legend(lines+lines2, labels+labels2, loc='upper right')` then - add a legend to the Axes, not the figure. – BigBen Feb 11 '22 at 19:31

score 1 · Answer 2 · answered Feb 10 '22 at 18:53

Not what you asked for, but you could make a grid of relplots with different y-axes without changing the shape of your df:

fg = sns.relplot(
    data=df,
    x = "index",
    y = "station_value",
    kind = "line",
    col = "key",
    row = "station",
    facet_kws={'sharey': False, 'sharex': True},
)

mitoRibo


Still helpful, I used the `facet_kws` arg and may need to plot the vars separately like this down the line. – Ndharwood Feb 10 '22 at 20:12
