I have a dataframe like the one below. Each cell in the "Values" column holds a list of elements. I want to visualize the distribution of the values in the "Values" column using histograms, either stacked in rows or separated by colour (by Area_code).

How can I extract the values and build these histograms in plotly? Other ideas are also welcome. Thank you.

   Area_code  Values
0  New_York   [999, 54, 231, 43, 177, 313, 212, 279, 199, 267]
1  Dallas     [915, 183, 2326, 316, 206, 31, 317, 26, 31, 56, 316]
2  XXX        [560]
3  YYY        [884, 13]
4  ZZZ        [203, 1066, 453, 266, 160, 109, 45, 627, 83, 685, 120, 410, 151, 33, 618, 164, 496]
vestland
Droid-Bird
  • It would be helpful if you provided the expected output of the graph. I believe you can also refer to this: https://plotly.com/python/histograms/#stacked-histograms – AS11 Apr 13 '21 at 13:45

1 Answer

If you reshape your data, this is a perfect case for px.histogram. From there you can choose between several aggregations, such as sum, avg, and count, through the histfunc argument:

fig = px.histogram(df, x = 'Area_code', y = 'Values', histfunc='sum')
fig.show()

You haven't specified what kind of output you're aiming for, but I'll leave it up to you to change the argument for histfunc and see which option suits your needs best.

I'm often inclined to urge users to rethink their entire data process, but I'm just going to assume that there are good reasons why you're stuck with what seems like a pretty weird setup in your dataframe. The complete snippet below includes a data munging step that reshapes your data from your setup to a so-called long format:

   Area_code  Values
0   New_York     999
1   New_York      54
2   New_York     231
3   New_York      43
4   New_York     177
5   New_York     313
6   New_York     212
7   New_York     279
8   New_York     199
9   New_York     267
10    Dallas     915
11    Dallas     183
12    Dallas    2326
13    Dallas     316
14    Dallas     206
15    Dallas      31
16    Dallas     317
17    Dallas      26
18    Dallas      31
19    Dallas      56
20    Dallas     316
21       XXX     560
22       YYY     884
23       YYY      13
24       ZZZ     203

And this is a perfect format for many of the great features of plotly.express.
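As a side note, the same long format can probably be reached without an explicit loop. The following is a minimal sketch, assuming pandas 0.25 or later (where `DataFrame.explode` is available); the shortened sample data and the name `long_df` are only for illustration and are not part of the original answer.

```python
import pandas as pd

# Shortened sample in the same shape as the question's dataframe (illustrative only)
df = pd.DataFrame({
    "Area_code": ["New_York", "XXX", "YYY"],
    "Values": [[999, 54, 231], [560], [884, 13]],
})

# explode() emits one row per list element and repeats Area_code next to it
long_df = df.explode("Values").reset_index(drop=True)

# the exploded column keeps the object dtype, so cast it back to integers
long_df["Values"] = long_df["Values"].astype(int)

print(long_df)
```

The resulting `long_df` has the same two columns as the table above and can be passed to px.histogram in the same way.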

Complete code:

import plotly.graph_objects as go
import plotly.express as px
import pandas as pd

# data input
df = pd.DataFrame({'Area_code': {0: 'New_York', 1: 'Dallas', 2: 'XXX', 3: 'YYY', 4: 'ZZZ'},
                 'Values': {0: [999, 54, 231, 43, 177, 313, 212, 279, 199, 267],
                  1: [915, 183, 2326, 316, 206, 31, 317, 26, 31, 56, 316],
                  2: [560],
                  3: [884, 13],
                  4: [203, 1066, 453, 266, 160, 109, 45, 627, 83, 685, 120, 410, 151, 33, 618, 164, 496]}})

# data munging: build one row per list element (long format)
areas = []
value = []
for _, row in df.iterrows():
    for val in row['Values']:
        areas.append(row['Area_code'])
        value.append(val)
df = pd.DataFrame({'Area_code': areas,
                   'Values': value})

# plotly
fig = px.histogram(df, x = 'Area_code', y = 'Values', histfunc='sum')
fig.show()
vestland
  • Excellent! Thank you @vestland. I have used the code like this: `fig = px.histogram(df, x = 'Values', color= 'Area_code', histfunc='avg', opacity=.4)` followed by `fig.show()`. It gives one correct output. However, is it possible to use `fig = px.histogram( ... )` like they did here: https://plotly.com/python/histograms/#custom-binning – Droid-Bird Apr 14 '21 at 09:34
  • @Droid-Bird Not sure what you mean... Perhaps you could explain things a bit closer in a new question? – vestland Apr 14 '21 at 12:36
  • @Droid-Bird The approach they're using under your provided link is `go.Histogram` and not a `px.histogram`. If that's what you meant in your previous comment then yes, I could set up an example with that too. – vestland Apr 14 '21 at 13:09
  • Yes, you have got my request right -- the output should look like the go.Histogram as they did. But be able to use your previous code with px.histogram. Thanks a million. – Droid-Bird Apr 15 '21 at 08:35