I am having trouble creating subplots in plotly given a pandas data frame with multiple colors.

Here is an incomplete example creating the individual plots:

import plotly.express

df = plotly.express.data.iris()

plot1 = plotly.express.scatter(df, x="sepal_width", y="sepal_length", color="species")
plot2 = plotly.express.scatter(df, x="petal_width", y="petal_length", color="species")

plot1.show()
plot2.show()

From what I read, something like this makes sense but does not work:

import plotly.express
import plotly.subplots

df = plotly.express.data.iris()

plot1 = plotly.express.scatter(df, x="sepal_width", y="sepal_length", color="species")
plot2 = plotly.express.scatter(df, x="petal_width", y="petal_length", color="species")

fig = plotly.subplots.make_subplots(rows=1, cols=2)

fig.append_trace(plot1, row=1, col=1)
fig.append_trace(plot2, row=1, col=2)

fig.show()

Looking into this, others seem to resolve this issue with a similar setup:

https://stackoverflow.com/a/65555470/4700548

r-beginners gives a good example below of how to make multiple subplots with one color, but this causes performance issues with many colors.

What is it that I am missing in these examples?

Edit: Added color to example. Added incomplete example.

Joseph

2 Answers

The linked example probably cannot be used as is because px and go have different internal data; I have not confirmed this. The change I made is to pass the individual traces instead of the whole figure: each one can be accessed as plot1.data[0] (or plot1['data'][0]) to get the scatter plot data.

import plotly.express as px
from plotly.subplots import make_subplots

df = px.data.iris()

plot1 = px.scatter(df, x="sepal_width", y="sepal_length", color="species")
plot2 = px.scatter(df, x="petal_width", y="petal_length", color="species")

#print(plot1['data'][0],plot1['data'][1],plot1['data'][2], end='\n')

fig = make_subplots(rows=1, cols=2, specs=[[{"type": "scatter"}, {"type": "scatter"}]])

fig.append_trace(plot1['data'][0], row=1, col=1)
fig.append_trace(plot1['data'][1], row=1, col=1)
fig.append_trace(plot1['data'][2], row=1, col=1)
fig.append_trace(plot2['data'][0], row=1, col=2)
fig.append_trace(plot2['data'][1], row=1, col=2)
fig.append_trace(plot2['data'][2], row=1, col=2)

fig.update_layout(showlegend=False)
fig.show()

r-beginners
  • Sorry, color was missing from my problem statement before. Your solution seems to only work if there is one color category. Is there a way to generalize this result? – Joseph May 05 '22 at 14:57
  • In this case, adding color creates three scatter plots, which are then added to each of the subplots. – r-beginners May 06 '22 at 06:08
  • If there are m figures and n colors, is it possible to make O(m) or O(n) traces? Having O(m*n) traces seems problematic for performance… – Joseph May 06 '22 at 06:49
  • The number of figures is determined by row and col. If you need n colors for one graph, just add each trace to the same row and col, as in the code. Depending on the number of graphs, this may affect performance. – r-beginners May 06 '22 at 06:58

I think my example was rather poor, but r-beginners gives the best solution as the question is posed.

The data I am working with has the same x and y axis labels. So, if there are n data frames, you can take advantage of that fact by adding a facet_col and concatenating them.

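import pandas as pd
import plotly.express

# df1, df2, ... are the individual data frames; each has the columns
# "dim1", "dim2", "label" and a "name" column identifying its source.
embed_df = pd.concat([
    df1,
    df2,
    df3,
    df4,
    # ....
])

fig = plotly.express.scatter(
    embed_df,
    x="dim1",
    y="dim2",
    color="label",
    facet_col="name")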
Joseph