I'm new to both plotly and pandas, and I'm looking for an elegant solution. I suspect that either I'm not using groupby effectively with plotly, or the way my data is stacked (one row per group per month) is preventing me from visualizing it.
To make a test chart, I built a fake data set by zipping together three lists (month, group, spend) and split it into "actual" and "forecast" values after a specific month (Mar '20).
When I tried to add a single trace from the forecast dataframe, which contains three different groups across several months, I got the monster below.
When I set the index to the group and then used loc to split the forecast into three separate sets (one per group), I managed to produce the following chart, although it feels like a Frankenstein solution:
I'm wondering whether plotly can chart the initial dataframe and change the line style after a certain point on the x axis. If not, is there a way to add a trace from a subset of the data that contains three different groups (group1, group2, group3)? I'm not sure that using three separate traces and splitting the data over and over is the best solution, and I'm confident that there is a more efficient one.
Here is how I currently get the separate groups:
# set the index to the group column
forecast = forecast.set_index(['group'])
# split into one frame per group
group1_forecast = forecast.loc['group1']
group2_forecast = forecast.loc['group2']
group3_forecast = forecast.loc['group3']
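For comparison, a minimal sketch of getting the same per-group subsets with `groupby` instead of one `loc` call per group. The small `forecast` frame is a hypothetical stand-in, and `forecast_by_group` is an illustrative name.

```python
import pandas as pd

# Hypothetical stand-in for the forecast frame (values copied from the post's data).
forecast = pd.DataFrame({
    "month": ["Apr '20"] * 3 + ["May '20"] * 3,
    "group": ["group1", "group2", "group3"] * 2,
    "spend": [97.9, 167.7, 84.5, 98.4, 167.9, 85.1],
})

# Iterating over a groupby yields (group name, sub-frame) pairs,
# so a dict comprehension collects one frame per group.
forecast_by_group = {name: sub for name, sub in forecast.groupby("group")}

group1_forecast = forecast_by_group["group1"]
```

This keeps `group` as an ordinary column, so no `set_index` step is needed and the code does not change when a fourth group is added.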
Here is the (minimal) code for the chart with the separate traces:
fig = px.line(actual,
              x="month", y="spend", color="group",
              title=title)

# group1 trace
fig.add_scatter(
    x=group1_forecast.month,
    y=group1_forecast.spend,
    mode='lines',
    line=dict(shape='linear', color='purple', width=1, dash='dot'),
    connectgaps=True
)

# group2 trace
fig.add_scatter(
    x=group2_forecast.month,
    y=group2_forecast.spend,
    mode='lines',
    line=dict(shape='linear', color='#33C1FF', width=1, dash='dot'),
    connectgaps=True
)

# group3 trace
fig.add_scatter(
    x=group3_forecast.month,
    y=group3_forecast.spend,
    mode='lines',
    line=dict(shape='linear', color='#FFDD33', width=1, dash='dot'),
    connectgaps=True
)

fig.show()
Here's the data:
import pandas as pd
import plotly.express as px

months = ["Mar '19", "Mar '19", "Mar '19",
"Apr '19", "Apr '19", "Apr '19",
"May '19", "May '19", "May '19",
"Jun '19", "Jun '19", "Jun '19",
"Jul '19", "Jul '19", "Jul '19",
"Aug '19", "Aug '19", "Aug '19",
"Sep '19", "Sep '19", "Sep '19",
"Oct '19", "Oct '19", "Oct '19",
"Nov '19", "Nov '19", "Nov '19",
"Dec '19", "Dec '19", "Dec '19",
"Jan '20", "Jan '20", "Jan '20",
"Feb '20", "Feb '20", "Feb '20",
"Mar '20", "Mar '20", "Mar '20",
"Apr '20", "Apr '20", "Apr '20",
"May '20", "May '20", "May '20",
"Jun '20", "Jun '20", "Jun '20",
"Jul '20", "Jul '20", "Jul '20",
"Aug '20", "Aug '20", "Aug '20",
"Sep '20", "Sep '20", "Sep '20"]
groups = ['group1', 'group2', 'group3',
'group1', 'group2', 'group3',
'group1', 'group2', 'group3',
'group1', 'group2', 'group3',
'group1', 'group2', 'group3',
'group1', 'group2', 'group3',
'group1', 'group2', 'group3',
'group1', 'group2', 'group3',
'group1', 'group2', 'group3',
'group1', 'group2', 'group3',
'group1', 'group2', 'group3',
'group1', 'group2', 'group3',
'group1', 'group2', 'group3',
'group1', 'group2', 'group3',
'group1', 'group2', 'group3',
'group1', 'group2', 'group3',
'group1', 'group2', 'group3',
'group1', 'group2', 'group3',
'group1', 'group2', 'group3']
spend = [57, 150, 75,
61.5, 156, 78,
66, 150, 75,
63, 162, 81,
69, 163.5, 81.75,
76.5, 162, 81,
78, 168, 84,
79.5, 168, 84,
84, 162, 81,
87, 169.5, 84.75,
93, 171, 85.5,
96, 169.5, 84.75,
97.5, 168, 84,
97.9, 167.7, 84.5,
98.4, 167.9, 85.1,
99.9, 168.1, 85.7,
100.9, 168, 86.1,
101.6, 168.4, 86.3,
102.7, 168.8, 86.9]
spend_by_group_list = list(zip(months, groups, spend))
spend_df = pd.DataFrame(spend_by_group_list, columns = ['month', 'group', 'spend'])
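The split into "actual" and "forecast" is not shown above. A minimal sketch of one way it could be done, assuming Mar '20 is the last actual month and that the labels parse with the format `%b '%y` (month abbreviations in an English locale); the shortened frame here is a hypothetical stand-in for `spend_df`:

```python
import pandas as pd

# Hypothetical shortened version of spend_df: four months, three groups.
month_labels = ["Feb '20", "Mar '20", "Apr '20", "May '20"]
spend_df = pd.DataFrame({
    "month": [m for m in month_labels for _ in range(3)],
    "group": ["group1", "group2", "group3"] * 4,
    "spend": [96, 169.5, 84.75, 97.5, 168, 84,
              97.9, 167.7, 84.5, 98.4, 167.9, 85.1],
})

# Parse labels such as "Mar '20" into real dates so they can be compared.
month_start = pd.to_datetime(spend_df["month"], format="%b '%y")
cutoff = pd.Timestamp("2020-03-01")  # assumed last month of actual data

actual = spend_df[month_start <= cutoff]
forecast = spend_df[month_start > cutoff]
```

Using `>=` for the forecast mask instead would include Mar '20 in both frames, which lets a dotted forecast line start exactly where the solid line ends.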