I currently have a box plot created in Plotly Express, using this code:

import plotly.express as px

fig = px.box(df, x="issuer.operator.name", y="validity_period", points='all', log_y=True,
            width=2000, height=1000,
            template="simple_white")
                
fig.show()

However, I want each box plot to be shown in a different color based on its x-axis category (e.g. Internet Sec Research Group is blue, Sectigo is red, etc.).

I know from this post that you can use the parameter color= '<column heading>' to choose how the graph is coloured. From the docs, the parameter color is

Either a name of a column in data_frame, or a pandas Series or array_like object. Values from this column or array_like are used to assign color to marks.

However, when I try to run the code with the additional color parameter such that

import plotly.express as px

fig = px.box(df, x="issuer.operator.name", y="validity_period", color="issuer.operator.name", points='all', log_y=True,
            width=2000, height=1000,
            template="simple_white")

fig.show()

I receive the following error:

KeyError: (nan, '', '', '')

How would I go about changing each boxplot's color? Appreciate any help.

Cloud9
    @user1740577, appreciate the edit. Seems I couldn't add an image directly due to my low rep at the moment. – Cloud9 Aug 19 '21 at 14:37

1 Answer

  • I have simulated your data frame.
  • Your code works as you require when there are no NaNs in issuer.operator.name.
  • Once I simulated a NaN in that column, I got the same error. It is resolved by dropna().
import random

import numpy as np
import pandas as pd
import plotly.express as px

# Simulated data: two issuers with random validity periods
df = pd.DataFrame({"issuer.operator.name":np.random.choice(["Internet Research", "Sectigo"],200),
                  "validity_period":np.random.uniform(1,100,200)*np.random.uniform(.001,1,200)})
# Inject a single NaN into the issuer column to reproduce the KeyError
df.iloc[random.randint(0,len(df)-1), 0] = np.nan
# Drop rows containing NaN before plotting so that color= works
fig = px.box(df.dropna(), x="issuer.operator.name", y="validity_period", color="issuer.operator.name", points='all', log_y=True,
            width=700, height=1000,
            template="simple_white")

fig.show()
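If you are not sure whether the column has NaNs, it may help to count them first. Below is a minimal sketch using pandas only. The helper name prepare_for_color and the "Unknown" label are hypothetical, made up for illustration and not part of Plotly or pandas. It either drops the rows with a missing issuer (only that column is checked, unlike a bare dropna()) or relabels them so they still get their own box.

```python
import numpy as np
import pandas as pd

COLOR_COL = "issuer.operator.name"


def prepare_for_color(df, column=COLOR_COL, fill_label=None):
    """Return a copy of df with no NaNs in `column` (hypothetical helper).

    If fill_label is None, rows where `column` is NaN are dropped.
    Otherwise the NaNs are replaced by fill_label so those rows are kept.
    """
    n_missing = int(df[column].isna().sum())
    print(f"{n_missing} missing value(s) in {column!r}")
    if fill_label is None:
        # Only look at the color column; NaNs elsewhere are left alone
        return df.dropna(subset=[column])
    return df.assign(**{column: df[column].fillna(fill_label)})


# Tiny made-up frame with one missing issuer and one missing y value
df = pd.DataFrame({
    COLOR_COL: ["Sectigo", np.nan, "Internet Research", "Sectigo"],
    "validity_period": [90.0, 30.0, 60.0, np.nan],
})

dropped = prepare_for_color(df)                          # missing issuer removed
labelled = prepare_for_color(df, fill_label="Unknown")   # missing issuer relabelled
```

Either result can then be passed to px.box(...) in place of df.dropna(), with color="issuer.operator.name" as before.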
Rob Raymond
  • Thank you @Rob Raymond. This seems to have solved the issue; for some reason I didn't think I had any NaNs in issuer.operator.name. Cheers – Cloud9 Aug 19 '21 at 17:23