2

I have a nested dict that looks like this:

mydict = {
 'A': {'apples': 3, 'bananas': 5, 'kiwis': 9, 'oranges': 6},
 'B': {'apples': 1, 'bananas': 9, 'kiwis': 1, 'oranges': 3},
 'C': {'apples': 6, 'bananas': 9, 'kiwis': 3, 'oranges': 3},
}

A, B, and C are the group labels. Apples, bananas, kiwis, and oranges are counts within each group. I'd like to plot a grouped vertical bar chart using matplotlib. It would have a legend with four colors, one each for apples, bananas, kiwis, and oranges.

So far I have only been able to plot this using the DataFrame.plot method:

pd.DataFrame(mydict).T.plot(kind='bar')


I want to be able to plot the same using matplotlib, so I can manage figure size, and change the size of the bars etc.

Mateen Ulhaq
user1189851

3 Answers

1

The pandas documentation says that pandas.DataFrame.plot() returns a matplotlib.axes.Axes object, so you can adjust the plot exactly as you would any other matplotlib plot.

So, using your example:

# Import libraries 
import pandas as pd
import matplotlib.pyplot as plt

# Create dictionary
plot_dict = {'A': {'Apples': 3,'Bananas':5,'Oranges':6,'Kiwis':9}, 'B': {'Apples': 1,'Bananas':9,'Oranges':3,'Kiwis':1}, 'C': {'Apples': 6,'Bananas':9,'Oranges':3,'Kiwis':3}}

# Plot using pandas built-in function plot()
ax = pd.DataFrame(plot_dict).T.plot.bar(zorder=5)

# Define aspects of the plot using matplotlib
ax.set_ylabel("Quantity")
ax.set_xlabel("Category")
ax.grid(axis='y', color='black', linestyle='-', linewidth=0.3)
ax.legend(loc='lower center', bbox_to_anchor=(0.5, -0.25), ncol=4, edgecolor='1', fontsize=10)
ax.locator_params(axis='y', nbins=12)

plt.savefig('./plot_from_nested_dict.svg', bbox_inches='tight')
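
Since the question asks specifically about figure size and bar size, here is a minimal sketch of one way to control both while still letting pandas draw the bars. It assumes you create the figure with plt.subplots and pass the axes to pandas through the ax argument; the sizes and the width value are arbitrary choices for illustration.

```python
import pandas as pd
import matplotlib.pyplot as plt

plot_dict = {'A': {'Apples': 3, 'Bananas': 5, 'Oranges': 6, 'Kiwis': 9},
             'B': {'Apples': 1, 'Bananas': 9, 'Oranges': 3, 'Kiwis': 1},
             'C': {'Apples': 6, 'Bananas': 9, 'Oranges': 3, 'Kiwis': 3}}

# Create the figure with matplotlib so its size is set up front
fig, ax = plt.subplots(figsize=(8, 4))

# Draw onto that axes; width is the total width of one group of bars
pd.DataFrame(plot_dict).T.plot.bar(ax=ax, width=0.8, rot=0)

# Everything from here on is ordinary matplotlib
ax.set_ylabel("Quantity")
fig.set_size_inches(10, 5)  # the figure can also be resized afterwards
```

With four columns and width=0.8, each individual bar should come out 0.2 wide. Call plt.show() or fig.savefig(...) to see the result.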
0

Firstly, you can manage figsize with the figsize argument or store the axes that are returned by the .plot method on the dataframe, so a pure matplotlib solution isn't the only way to go.

Having said that, the key to grouped bars in matplotlib is the offset. Each set of bars (e.g. apples) must be shifted away from the xticks by a multiple of the bar width (e.g. width * 2).

d = {"A": {...}, ...}

import matplotlib.pyplot as plt
import numpy as np

# Get the ax object and hardcode a bar width
fig, ax = plt.subplots()

width = 0.05

# Each set of bars (e.g. "bananas") is offset from the xtick so they're grouped
# For example np.arange(3) - width*2 for an offset of negative two bar widths 
ax.bar(np.arange(3) - width*2, [d[j]["apples"] for j in d], width, label="apples")
ax.bar(np.arange(3) - width, [d[j]["bananas"] for j in d], width, label="bananas")
ax.bar(np.arange(3), [d[j]["oranges"] for j in d], width, label="oranges")
ax.bar(np.arange(3) + width, [d[j]["kiwis"] for j in d], width, label="kiwis")

# Tick labels and legend
ax.set_xticks(np.arange(3))
ax.set_xticklabels(["A", "B", "C"])
ax.legend()
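
The hardcoded offsets above can be generalized to any number of series. The following is only a sketch: it assumes every group has the same keys, and the names group_width, bar_width, and offsets are made up for illustration. The offsets are computed so that each group ends up centered on its xtick.

```python
import matplotlib.pyplot as plt
import numpy as np

d = {"A": {"apples": 3, "bananas": 5, "oranges": 6, "kiwis": 9},
     "B": {"apples": 1, "bananas": 9, "oranges": 3, "kiwis": 1},
     "C": {"apples": 6, "bananas": 9, "oranges": 3, "kiwis": 3}}

groups = list(d)                        # group labels for the x-axis
series = list(next(iter(d.values())))   # fruit names, taken from the first group
x = np.arange(len(groups))

group_width = 0.8                       # assumed share of the tick spacing used per group
bar_width = group_width / len(series)

# One offset per series, symmetric around zero so the group is centered on the tick
offsets = (np.arange(len(series)) - (len(series) - 1) / 2) * bar_width

fig, ax = plt.subplots(figsize=(8, 4))
for name, offset in zip(series, offsets):
    heights = [d[g][name] for g in groups]
    ax.bar(x + offset, heights, bar_width, label=name)

ax.set_xticks(x)
ax.set_xticklabels(groups)
ax.legend()
```

Changing figsize or group_width then adjusts the figure size and the bar size without touching the loop.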
Charles Landau
0
  • I want to be able to plot the same using matplotlib so I can manage figure size and change the size of the bars etc.
    • pandas.DataFrame.plot uses matplotlib as the default backend, and returns a matplotlib.axes.Axes, so matplotlib formatting methods work without issue.
      • Review the pandas.DataFrame.plot documentation for all of the formatting parameters
        • figsize, width, title, ylabel, xlabel, grid, subplots, and many more.
      • .legend and other methods are called against ax.
  • The most direct way to create a DataFrame from the dict, with the correct orientation for the plot, is to use pandas.DataFrame.from_dict with orient='index'.
    • Whichever values are in the index will be on the x-axis, and the column headers will be the color labels.
    • For more complex dict or JSON data, see answers that use .json_normalize.
  • For additional details about bar label annotations, see How to add value labels on a bar chart and How to plot and annotate a grouped bar chart
  • Grouped bar chart with labels, in the matplotlib documentation, shows a purely matplotlib implementation (without pandas).
  • Tested in python 3.11.2, pandas 2.0.1, matplotlib 3.7.1
import pandas as pd

d = {'A': {'apples': 3, 'bananas': 5, 'oranges': 6, 'kiwis': 9},
     'B': {'apples': 1, 'bananas': 9, 'oranges': 3, 'kiwis': 1},
     'C': {'apples': 6, 'bananas': 9, 'oranges': 3, 'kiwis': 3}}

# with pandas
df = pd.DataFrame.from_dict(d, orient='index')

# custom colors, if desired
color = dict(zip(df.columns, ['green', 'yellow', 'orange', 'tan']))

# plot
ax = df.plot(kind='bar', rot=0, color=color, title='With pandas', width=0.85,
             figsize=(8, 5), xlabel='Category', ylabel='Quantity')

# cosmetic formatting
ax.legend(bbox_to_anchor=(1, 0.5), loc='center left', frameon=False)
ax.spines[['right', 'top']].set_visible(False)

# iterate through the containers associated with the axes (ax)
for c in ax.containers:
    ax.bar_label(c, fmt=lambda x: f'{x:0.0f}' if x > 0 else '')

Figure: the plot with width=0.85

Figure: the plot with the default width

df

   apples  bananas  oranges  kiwis
A       3        5        6      9
B       1        9        3      1
C       6        9        3      3
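
To make the orientation point concrete, here is a small check, offered as a sketch rather than a definitive test, that from_dict with orient='index' produces the same layout as building the frame and transposing it. The variable names by_index and transposed are made up for illustration.

```python
import pandas as pd

d = {'A': {'apples': 3, 'bananas': 5, 'oranges': 6, 'kiwis': 9},
     'B': {'apples': 1, 'bananas': 9, 'oranges': 3, 'kiwis': 1},
     'C': {'apples': 6, 'bananas': 9, 'oranges': 3, 'kiwis': 3}}

# Outer keys become the index (x-axis), inner keys become the columns (legend)
by_index = pd.DataFrame.from_dict(d, orient='index')

# The approach from the question: outer keys become columns, then transpose
transposed = pd.DataFrame(d).T

print(by_index.index.tolist())    # the group labels
print(by_index.columns.tolist())  # the fruit names
print(by_index.loc['B', 'bananas'], transposed.loc['B', 'bananas'])
```

Either frame can be passed to .plot(kind='bar'); from_dict just states the intended orientation explicitly.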
Trenton McKinney