
I have data that I transformed up to this point (picture below). How can I now plot faceted histograms that show the programming languages used per job? I first tried with just two columns:

px.histogram(languages_job_title_grouped, x=['Python', 'SQL'], facet_col='Role', facet_col_wrap=4, height=1000)

But it didn't work: it draws one histogram per job, but the bars are the same for every role (second picture below). How can I do this the right way?

[Image: the transformed dataframe]

[Image: the faceted histogram, with identical bars for every role]

  • can you include the sample of your dataframe as formatted text instead of an image (see [here](https://stackoverflow.com/questions/20109391/how-to-make-good-reproducible-pandas-examples))? you can copy and paste the output from languages_job_title_grouped.head(10).to_dict() into your question – thank you! – Derek O Dec 07 '22 at 18:49
  • @DerekO Sure, there you go: {'Python': {'Business Analyst': 0.29906542056074764, 'DBA/Database Engineer': 0.2465753424657534, 'Data Analyst': 0.299390243902439, 'Data Engineer': 0.291497975708502, 'Data Scientist': 0.3303467903446822, 'Not employed': 0.3525641025641026 }} I had to shorten it to fit the comment size limit; the other keys alongside 'Python' are ['SQL', 'None/NA', 'R', 'Javascript', 'Java'] – Sebastian Kaczmarczyk Dec 07 '22 at 18:59

1 Answer

From the context of your question, it seems like you are looking for a bar plot instead.

That is, if I understand correctly, you are starting from a dataframe equivalent to

[Image: example_df]

and trying to plot

[Image: bar_plot]

where each facet is one index value, the x-axis holds the column names, and the bar heights are the values in the dataframe.

The code that generates this is:

import pandas as pd
import plotly.express as px

df = pd.DataFrame(
    [[0.1, 0.3, 0.5], [0.2, 0.1, 0.8], [0.5, 0.3, 0.9]],
    columns=["a", "b", "c"],
    index=["index1", "index2", "index3"],
)

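# Melt to long format (one row per index/column pair), then draw one facet per index value.
fig = px.bar(
    df.melt(ignore_index=False).reset_index(),
    facet_col="index",
    x="variable",
    y="value",
    barmode="group",
)
fig.show()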

The key is to reshape your DataFrame from wide to long format with melt before plotting it with plotly.express.
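To make the reshaping step concrete, here is a minimal sketch of what melt produces, using pandas only. It reuses the toy values from the example above; the index name "Role" is an assumption borrowed from the facet_col argument in the question, and your real dataframe may be labelled differently.

```python
import pandas as pd

# Same toy values as the example above. The index name "Role" is an
# assumption taken from the question's facet_col argument.
wide = pd.DataFrame(
    [[0.1, 0.3, 0.5], [0.2, 0.1, 0.8], [0.5, 0.3, 0.9]],
    columns=["a", "b", "c"],
    index=pd.Index(["index1", "index2", "index3"], name="Role"),
)

# melt stacks the columns into (variable, value) pairs. ignore_index=False
# keeps the original index, and reset_index turns it into a "Role" column.
long = wide.melt(ignore_index=False).reset_index()

print(long.columns.tolist())  # ['Role', 'variable', 'value']
print(long.shape)             # (9, 3): one row per (Role, column) pair
```

With the index named this way, the long frame should be usable with facet_col="Role" (plus facet_col_wrap=4 and height=1000 from the question) in place of facet_col="index".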

Tim Child