
I have the two dictionaries below:

data1 = {'value-A': {'mean': 10.0, 'count': 100},
         'value-B': {'mean': 2.0, 'count': 150},
         'value-C': {'mean': 6.6, 'count': 220},
         'value-D': {'mean': 11.4, 'count': 200}}

data2 = {'value-A': {'mean': 20.0, 'count': 50},
         'value-B': {'mean': 6.0, 'count': 100},
         'value-C': {'mean': 18.6, 'count': 150},
         'value-D': {'mean': 30.4, 'count': 120}}

I have the following questions:

  1. How can I plot a bar graph for a single dictionary?
  2. How can I plot a comparison bar graph of the mean values from two different dictionaries like those above? I am looking for something like this snapshot: [image: snapshot of the desired plot]
Trenton McKinney

2 Answers

0

You can use pyplot.bar to make a bar plot. Check the documentation for details, but you need the locations of the bars (in your case, the mean values) and the heights of the bars (the count values). I suggest renaming the keys inside the dictionary to make iterating through it easier:

data = {
"value-A" : {
    "value-a-mean" : 10.0,
    "value-a-count" : 100 
},
"value-B" : {
    "value-b-mean" : 2.0,
    "value-b-count" : 150 
},
"value-C" : {
    "value-c-mean" : 6.6,
    "value-c-count" : 220 
},
"value-D" : {
    "value-d-mean" : 11.4,
    "value-d-count" : 200 
    }
} 


for key, value in data.items():
    # Loop over a copy of the keys, because the inner dictionary is modified in the loop
    for sub_key in list(value):
        # 'value-a-mean'.split('-') gives ['value', 'a', 'mean']
        # Take the last item from this list [-1], so just 'mean' or 'count'
        value[sub_key.split('-')[-1]] = value.pop(sub_key)
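
Adding and deleting keys while iterating over a dictionary can raise a RuntimeError in Python 3. As an alternative, here is a minimal sketch that builds a new dictionary with a comprehension instead of changing the existing one in place. It assumes every inner key ends in '-mean' or '-count', as in the example above:

```python
data = {
    "value-A": {"value-a-mean": 10.0, "value-a-count": 100},
    "value-B": {"value-b-mean": 2.0, "value-b-count": 150},
    "value-C": {"value-c-mean": 6.6, "value-c-count": 220},
    "value-D": {"value-d-mean": 11.4, "value-d-count": 200},
}

# Build a new dictionary instead of adding and deleting keys while looping.
# 'value-a-mean'.split('-')[-1] is 'mean', so only the last part of each key is kept.
data = {
    key: {sub_key.split('-')[-1]: sub_value for sub_key, sub_value in value.items()}
    for key, value in data.items()
}
```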

Now data will look like this:

{
    "value-A" : {
        "mean" : 10.0,
        "count" : 100 
    },
    "value-B" : {
        "mean" : 2.0,
        "count" : 150 
    },
    "value-C" : {
        "mean" : 6.6,
        "count" : 220 
    },
    "value-D" : {
        "mean" : 11.4,
        "count" : 200 
    }
} 

Then you can do:

from matplotlib import pyplot

data = {
    "value-A" : {
        "mean" : 10.0,
        "count" : 100 
    },
    "value-B" : {
        "mean" : 2.0,
        "count" : 150 
    },
    "value-C" : {
        "mean" : 6.6,
        "count" : 220 
    },
    "value-D" : {
        "mean" : 11.4,
        "count" : 200 
    }
} 

means, counts, labels = [], [], []

# Read the dictionary and save values
for key, value in data.items():
    labels.append(key)
    means.append(value['mean'])
    counts.append(value['count'])

# Bar plot
pyplot.bar(means, counts, tick_label=labels)
pyplot.show()

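
As a variation on the plot above, if the goal is one bar per category with the mean as the bar height, the category names can be passed as the x values. This is only a sketch: plot_means is a hypothetical helper name, and data1 is copied from the question.

```python
from matplotlib import pyplot

data1 = {'value-A': {'mean': 10.0, 'count': 100},
         'value-B': {'mean': 2.0, 'count': 150},
         'value-C': {'mean': 6.6, 'count': 220},
         'value-D': {'mean': 11.4, 'count': 200}}


def plot_means(data, ax=None):
    """Draw one bar per category; the bar height is that category's mean.

    plot_means is a hypothetical helper, not part of matplotlib.
    """
    if ax is None:
        _, ax = pyplot.subplots()
    labels = list(data)
    means = [data[label]['mean'] for label in labels]
    # String x values are treated as categories and used as tick labels
    bars = ax.bar(labels, means)
    ax.set_ylabel('mean')
    return bars


bars = plot_means(data1)

if __name__ == "__main__":
    pyplot.show()
```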

You can repeat the pyplot.bar call for as many dictionaries as you want, and the bars are drawn on the same axes. Use a legend to label each dataset:

# Random 2nd dictionary
data2 = {
    "value-A" : {
        "mean" : 20.0,
        "count" : 100 
    },
    "value-B" : {
        "mean" : 3.0,
        "count" : 150 
    },
    "value-C" : {
        "mean" : 5.6,
        "count" : 220 
    },
    "value-D" : {
        "mean" : 8.4,
        "count" : 200 
    }
} 

means, counts, labels = [], [], []

for key, value in data.items():
    means.append(value['mean'])
    counts.append(value['count'])

pyplot.bar(means, counts, label="Dataset 1")

means, counts, labels = [], [], []

for key, value in data2.items():
    means.append(value['mean'])
    counts.append(value['count'])

pyplot.bar(means, counts, label="Dataset 2")
pyplot.legend()
pyplot.show()

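
For a side-by-side comparison of the means per category, one possible approach is to shift the bars of each dataset by a fraction of the group width. The sketch below assumes both dictionaries share the same keys; plot_mean_comparison is a hypothetical helper name, and data1 and data2 are copied from the question.

```python
from matplotlib import pyplot

data1 = {'value-A': {'mean': 10.0, 'count': 100},
         'value-B': {'mean': 2.0, 'count': 150},
         'value-C': {'mean': 6.6, 'count': 220},
         'value-D': {'mean': 11.4, 'count': 200}}

data2 = {'value-A': {'mean': 20.0, 'count': 50},
         'value-B': {'mean': 6.0, 'count': 100},
         'value-C': {'mean': 18.6, 'count': 150},
         'value-D': {'mean': 30.4, 'count': 120}}


def plot_mean_comparison(datasets):
    """Grouped bar plot of the means; datasets maps a legend label to a dictionary.

    Hypothetical helper; assumes every dictionary has the same category keys.
    """
    _, ax = pyplot.subplots()
    labels = list(next(iter(datasets.values())))
    width = 0.8 / len(datasets)  # each group of bars spans 0.8 of a unit
    containers = []
    for i, (name, data) in enumerate(datasets.items()):
        # Shift each dataset to the right by one bar width
        positions = [x + i * width for x in range(len(labels))]
        means = [data[label]['mean'] for label in labels]
        containers.append(ax.bar(positions, means, width=width, label=name))
    # Put one tick in the middle of each group and label it with the category
    offset = width * (len(datasets) - 1) / 2
    ax.set_xticks([x + offset for x in range(len(labels))])
    ax.set_xticklabels(labels)
    ax.set_ylabel('mean')
    ax.legend()
    return ax, containers


ax, containers = plot_mean_comparison({'data1': data1, 'data2': data2})

if __name__ == "__main__":
    pyplot.show()
```

Passing a single entry, for example {'data1': data1}, gives an ordinary bar plot for one dictionary.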

0
  • Given nested dictionaries, or dictionaries in general, it is typically easier to use pandas, and then to plot with seaborn or pandas.DataFrame.plot.
    • seaborn is a high-level API for matplotlib
    • pandas.DataFrame.plot uses matplotlib as the default backend.
    • The shape of the dataframe, and the desired plot, should determine which plotting API to use.
  • Also see How to plot a bar plot from a nested dictionary, which plots directly with pandas.DataFrame.plot.
import pandas as pd
import seaborn as sns

# create a dataframe with a unique column, data, to identify which dictionary
df = pd.concat([pd.DataFrame(d).assign(data=i) for i, d in enumerate([data1, data2], start=1)], ignore_index=False)

# reset the index to be a column and rename it
df = df.reset_index().rename({'index': 'category'}, axis=1)

# convert the dataframe to long form to work more easily with seaborn
dfm = df.melt(id_vars=['category', 'data'])

# plot both mean and count
g = sns.catplot(data=dfm, kind='bar', x='variable', y='value', hue='data', col='category')

# plot only the means by selecting the rows where category is 'mean'
data = dfm[dfm['category'].eq('mean')]
g = sns.catplot(data=data, kind='bar', x='variable', y='value', hue='data', col='category')

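
For comparison, the pandas.DataFrame.plot route mentioned above could look roughly like the sketch below. It assumes only the means are wanted; the column names 'data1' and 'data2' are labels chosen here for the legend, and the dictionaries are copied from the question.

```python
import pandas as pd

data1 = {'value-A': {'mean': 10.0, 'count': 100},
         'value-B': {'mean': 2.0, 'count': 150},
         'value-C': {'mean': 6.6, 'count': 220},
         'value-D': {'mean': 11.4, 'count': 200}}

data2 = {'value-A': {'mean': 20.0, 'count': 50},
         'value-B': {'mean': 6.0, 'count': 100},
         'value-C': {'mean': 18.6, 'count': 150},
         'value-D': {'mean': 30.4, 'count': 120}}

# Keep only the means: the categories become the index, one column per dictionary
means = pd.DataFrame({
    'data1': {key: value['mean'] for key, value in data1.items()},
    'data2': {key: value['mean'] for key, value in data2.items()},
})

# With the categories on the index, each column is drawn as one set of grouped bars
ax = means.plot.bar(rot=0)
ax.set_ylabel('mean')

if __name__ == "__main__":
    import matplotlib.pyplot as plt
    plt.show()
```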

df

  category  value-A  value-B  value-C  value-D  data
0     mean     10.0      2.0      6.6     11.4     1
1    count    100.0    150.0    220.0    200.0     1
2     mean     20.0      6.0     18.6     30.4     2
3    count     50.0    100.0    150.0    120.0     2

dfm

   category  data variable  value
0      mean     1  value-A   10.0
1     count     1  value-A  100.0
2      mean     2  value-A   20.0
3     count     2  value-A   50.0
4      mean     1  value-B    2.0
5     count     1  value-B  150.0
6      mean     2  value-B    6.0
7     count     2  value-B  100.0
8      mean     1  value-C    6.6
9     count     1  value-C  220.0
10     mean     2  value-C   18.6
11    count     2  value-C  150.0
12     mean     1  value-D   11.4
13    count     1  value-D  200.0
14     mean     2  value-D   30.4
15    count     2  value-D  120.0
Trenton McKinney