
I'm trying to create 2 plots with a shared x-axis and I'm having 2 problems with this:

  1. As soon as I customise the layout with yaxis and yaxis2 titles and/or tickmarks, y-axes begin to overlap
  2. I would like the legends to be shared between the 2 plots, but instead they are duplicated

Here is the code to reproduce the problem I'm experiencing:

from plotly.offline import init_notebook_mode, iplot
init_notebook_mode(connected=True) # using jupyter
import plotly.graph_objs as go
from plotly import tools
import numpy as np

N = 100
epoch_range = [i for i in range(N)]
model_perf = {}
for m in ['acc','loss']:
    for sub in ['train','validation']:
        if sub == 'train':
            history_target = m
        else:
            history_target = 'val_{}'.format(m)
        model_perf[history_target] = np.random.random(N)

line_type = {
    'train': dict(
        color='grey',
        width=1,
        dash='dash'
    ),
    'validation': dict(
        color='blue',
        width=4
    )
}

fig = tools.make_subplots(rows=2, cols=1, shared_xaxes=True, shared_yaxes=False, specs = [[{'b':10000}], [{'b':10000}]])
i = 0
for m in ['acc','loss']:

    i += 1

    for sub in ['train','validation']:

        if sub == 'train':
            history_target = m
        else:
            history_target = 'val_{}'.format(m)

        fig.append_trace({
            'x': epoch_range,
            'y': model_perf[history_target],
            #'type': 'scatter',
            'name': sub,
            'legendgroup': m,
            'yaxis': dict(title=m),
            'line': line_type[sub],
            'showlegend': True
        }, i, 1)

fig['layout'].update(
    height=600, 
    width=800, 
    xaxis = dict(title = 'Epoch'),
    yaxis = dict(title='Accuracy', tickformat=".0%"),
    yaxis2 = dict(title='Loss', tickformat=".0%"),
    title='Performance'
)
iplot(fig)  

And here is the image that I get: Output

If you have any suggestions on how to solve these 2 problems, I'd love to hear from you.

Many thanks in advance!

EDIT:

Following Fabrice's advice, I looked into the create_facet_grid function from plotly.figure_factory (which, by the way, requires plotly 2.0.12+). I managed to reproduce the same image with fewer lines, but it gave me less flexibility: for example, I don't think you can plot lines with this function, and it also has the legend duplication issue. If you are looking for an ad hoc viz, though, this might be quite effective. It requires data in long format; see the example below:

# converting into the long format
import pandas as pd
perf_df = (
    pd.DataFrame({
        'accuracy_train': model_perf['acc'],
        'accuracy_validation': model_perf['val_acc'],
        'loss_train': model_perf['loss'],
        'loss_validation': model_perf['val_loss']
    })
    .stack()
    .reset_index()
    .rename(columns={
        'level_0': 'epoch',
        'level_1': 'variable',
        0: 'value'
    })
)

perf_df = pd.concat(
    [
        perf_df,
        perf_df['variable']
        .str
        .extractall(r'(?P<metric>^.*)_(?P<set>.*$)')
        .reset_index()[['metric','set']]   
    ], axis=1
).drop(['variable'], axis=1)

perf_df.head() # result

epoch  value     metric     set
0      0.434349  accuracy   train
0      0.374607  accuracy   validation
0      0.864698  loss       train
0      0.007445  loss       validation
1      0.553727  accuracy   train

# plot it
import plotly.figure_factory as ff  # provides create_facet_grid
fig = ff.create_facet_grid(
    perf_df,
    x='epoch',
    y='value',
    facet_row='metric',
    color_name='set',
    scales='free_y',
    ggplot2=True
)

fig['layout'].update(
    height=800, 
    width=1000, 
    yaxis1 = dict(tickformat=".0%"),
    yaxis2 = dict(tickformat=".0%"),
    title='Performance'
)

iplot(fig)

And here is the result: create_facet_grid

IVR

2 Answers


After doing a little more digging I've found the solution to both my problems.

First, the overlapping y-axis problem was caused by the yaxis argument in the layout update; it had to be changed to yaxis1.

The second problem, the duplicated legend entries, was a little trickier, but this post helped me work it out. Each trace can have its own legend entry, so when you plot multiple traces you may want to show the entry from only one of them (using the showlegend argument). To make sure that one legend entry toggles the matching traces in all subplots, give those traces the same legendgroup.
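
To isolate just the legend logic, here is a minimal sketch that builds the trace settings as plain dicts, without calling plotly at all. The helper name `build_trace_specs` is made up for illustration; only the `name`, `legendgroup` and `showlegend` keys come from the code in this post.

```python
def build_trace_specs(metrics, subsets):
    """Return (trace_dict, row) pairs with one legend entry per subset.

    Hypothetical helper: it only prepares the legend-related keys and
    leaves the x/y data and the actual plotting to the caller.
    """
    specs = []
    for row, metric in enumerate(metrics, start=1):
        for subset in subsets:
            trace = {
                'name': subset,
                'legendgroup': subset,   # same group in every subplot
                'showlegend': row == 1,  # only the first subplot adds a legend entry
            }
            specs.append((trace, row))
    return specs


specs = build_trace_specs(['acc', 'loss'], ['train', 'validation'])
for trace, row in specs:
    print(row, trace)
```

Each dict would still need its 'x', 'y' and 'line' entries before being passed to append_trace for the given row.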

Here is the full code with the solution:

from plotly.offline import init_notebook_mode, iplot
init_notebook_mode(connected=True) # using jupyter
import plotly.graph_objs as go
from plotly import tools
import numpy as np

N = 100
epoch_range = [i for i in range(N)]
model_perf = {}
for m in ['acc','loss']:
    for sub in ['train','validation']:
        if sub == 'train':
            history_target = m
        else:
            history_target = 'val_{}'.format(m)

        model_perf[history_target] = np.random.random(N)

line_type = {
    'train': dict(
        color='grey',
        width=1,
        dash='dash'
    ),
    'validation': dict(
        color='blue',
        width=4
    )
}

fig = tools.make_subplots(
    rows=2, 
    cols=1, 
    shared_xaxes=True, 
    shared_yaxes=False
)

i = 0
for m in ['acc','loss']:

    i += 1

    # show legend entries only for the first subplot
    legend_display = (m == 'acc')

    for sub in ['train','validation']:

        if sub == 'train':
            history_target = m
        else:
            history_target = 'val_{}'.format(m)

        fig.append_trace({
            'x': epoch_range,
            'y': model_perf[history_target],
            'name': sub,
            'legendgroup': sub, # toggle train / validation group on all subplots
            'yaxis': dict(title=m),
            'line': line_type[sub],
            'showlegend': legend_display # this is now dependent on the trace
        }, i, 1)

fig['layout'].update(
    height=600, 
    width=800, 
    xaxis = dict(title = 'Epoch'),
    yaxis1 = dict(title='Accuracy', tickformat=".0%"),
    yaxis2 = dict(title='Loss', tickformat=".0%"),
    title='Performance'
)
iplot(fig)  

And here is the result:

[image: resulting plot]

IVR
  • Actually, if you want the x-axis title to appear under the bottom facet, replace `xaxis` with `xaxis1` in the layout update chunk. – IVR Dec 24 '17 at 03:16

In my experience, visualization tools prefer data in long format. You might want to reshape your data into a table with columns like:

  • epoch
  • variable : 'acc' or 'loss'
  • set: 'validation' or 'train'
  • value : the value for the given epoch/variable/set

By doing this, you might find it easier to create the graph you want by faceting on 'variable', with the 'set' traces having x=epoch, y=value.
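
As a rough sketch of that reshaping, assuming the history is a dict keyed 'acc', 'val_acc', 'loss' and 'val_loss' as in the question, something like the following produces one row per epoch/variable/set. The `to_long` helper and the random placeholder data are illustrative assumptions, not part of any library.

```python
import random

N = 5  # small illustrative size; the question uses N = 100
random.seed(0)

# Placeholder wide-format history, keyed like the question's model_perf.
model_perf = {
    key: [random.random() for _ in range(N)]
    for key in ['acc', 'val_acc', 'loss', 'val_loss']
}


def to_long(perf):
    """Flatten {'acc': [...], 'val_acc': [...], ...} into long-format rows."""
    rows = []
    for key, values in perf.items():
        if key.startswith('val_'):
            variable, subset = key[len('val_'):], 'validation'
        else:
            variable, subset = key, 'train'
        for epoch, value in enumerate(values):
            rows.append({
                'epoch': epoch,
                'variable': variable,
                'set': subset,
                'value': value,
            })
    return rows


long_rows = to_long(model_perf)
print(long_rows[0])
```

The resulting list of dicts can be passed to pandas.DataFrame if a table is needed for a faceting function.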

If you'd like a coded solution, please provide some data.

Hope this was helpful.

  • Thanks for the suggestion. I'm used to working with ggplot, which does prefer data in the format you described. However, plotly appears to be indifferent to the underlying structure of your data; the structure is derived from the declared traces. So if you have any recommendations on how to fix the issues I described using any data format, I'd like to learn more. Cheers – IVR Dec 22 '17 at 21:54
  • I started with ggplot as well, but now I'm using Python instead of R, and plotly's create_facet_grid function works in much the same way as ggplot in this respect. – Fabrice Deseyn Dec 22 '17 at 21:57
  • There might indeed be another solution to this, but I was always told: 'first make it work, then make it better'. At the moment, it's the only thing that I can think of. – Fabrice Deseyn Dec 22 '17 at 21:58
  • Thanks, I'll try create_facet_grid and will let you know how it goes! – IVR Dec 22 '17 at 22:02