
Please see the image below for the test dataset. I have the dataset on the left side, and I used the following to get to test_df:

test_df1=test_df.groupby(['year','country','won/loss']).sum()
x=test_df1.pivot_table(index='year',columns=['country','won/loss'],values='value')
x=x.fillna(0)

I would like to plot a stacked bar plot with two levels of x-tick labels, just like the one shown here (also shown in the image below): Plot two levels of x_ticklabels on a pandas multi-index dataframe

But I don't know how to convert the test_df data to the form shown in the linked post. Also, the code

ax.tick_params(axis='both', which='both', length=0)

in that post doesn't seem to work.

Is there a way to plot two levels of x-tick labels directly from the test_df data, or do I need to convert it to the format in the post mentioned above (and if so, how)? Also, how do I set the x-axis label so that the text "country" and "year" from the test_df dataset doesn't show up? The x-axis labels should just be 2010 (3 countries) / 2011 (3 countries) / 2012 (3 countries).

Thanks a lot for your help.

[Image: test dataset and the desired two-level stacked bar plot]

xiaokai
  • If you are still interested in a solution with a MultiIndex, let me know which two levels you want on the x-axis, based on the bar chart I show below. I can then extend the sample data to create something like the image and link you shared. – Patrick FitzGerald Feb 13 '21 at 09:43

1 Answer


It is unclear which variable you want to use for stacking the bars, but in any case I assume you want to plot wins and losses side by side for each country. The sample data does not contain enough values for a MultiIndex dataframe to make sense. For example, using 'country' and 'year' on the x-axis with side-by-side bars for the 'won/loss' values would lead to single bars standing next to nonexistent bars where value = 0.

Therefore, in the following example, the data is aggregated by year (which can be seen as 'stacking the data' before plotting) and shown for each country, with the total sum of values for wins and losses side by side. This can be done by pivoting the dataframe differently from how you did it. Also note that .groupby([...]).sum() is not necessary.

import numpy as np   # v 1.19.2
import pandas as pd  # v 1.1.3

# Create sample dataset
rng = np.random.default_rng(seed=1)
country_names = ['Japan', 'Ireland', 'UK']
years_list = [2010, 2011, 2012]
years = np.tile(years_list, len(country_names))
countries = np.repeat(country_names, len(years_list))
won_loss = rng.choice(['won', 'loss'], size=countries.size)
values = rng.integers(1, 5, size=countries.size)

df = pd.DataFrame({'year': years,
                   'country': countries,
                   'won/loss': won_loss,
                   'value': values})
df

#     year  country  won/loss  value
#  0  2010    Japan       won      2
#  1  2011    Japan      loss      4
#  2  2012    Japan      loss      2
#  3  2010  Ireland      loss      2
#  4  2011  Ireland       won      4
#  5  2012  Ireland       won      2
#  6  2010       UK      loss      2
#  7  2011       UK      loss      3
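#  8  2012       UK       won      3

# Pivot so that each (country, year) row gets one column per won/loss value
# (passing a list as index requires pandas 1.1 or later)
df_pivot = df.pivot(index=['country', 'year'], columns='won/loss').fillna(0)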
df_pivot

[Image: pivot table output]
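
On the point that .groupby([...]).sum() is not necessary: pivot_table can do the summing itself through its aggfunc argument. The sketch below is only an illustration, not the code from the question's real data: the test_df values are invented, with one duplicated (year, country, won/loss) key, and it compares the two-step pipeline from the question with a single pivot_table call.

```python
import pandas as pd

# Hypothetical long-format data (values invented for illustration);
# the key (2010, 'Japan', 'won') appears twice on purpose
test_df = pd.DataFrame({
    'year':     [2010, 2010, 2010, 2011, 2011],
    'country':  ['Japan', 'Japan', 'UK', 'Japan', 'UK'],
    'won/loss': ['won', 'won', 'loss', 'loss', 'won'],
    'value':    [1, 2, 3, 4, 5],
})

# Two-step version from the question: groupby + sum, then pivot_table
test_df1 = test_df.groupby(['year', 'country', 'won/loss']).sum()
two_step = test_df1.pivot_table(index='year',
                                columns=['country', 'won/loss'],
                                values='value').fillna(0)

# One-step version: let pivot_table sum the duplicates and fill the gaps
one_step = test_df.pivot_table(index='year',
                               columns=['country', 'won/loss'],
                               values='value', aggfunc='sum', fill_value=0)

# Both tables hold the same numbers under the same column labels
same_values = (two_step.values == one_step.values).all()
print(same_values)
```

Note that df.pivot, unlike pivot_table, raises a ValueError when an index/column combination occurs more than once, so pivot_table is the safer choice whenever the data may contain duplicate keys.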

# Aggregate data by summing values of all years together
df_countries = df_pivot.groupby('country').sum()
df_countries.columns = df_countries.columns.droplevel(0)
df_countries

[Image: table of values aggregated by country]

# Create grouped bar chart of won/loss values aggregated by country
ax = df_countries.plot.bar(rot=0, zorder=2)

# Format grid and spines
ax.grid(axis='y', zorder=1, color='black', alpha=0.3)
for spine in ['top', 'left', 'right']:
    ax.spines[spine].set_visible(False)
ax.spines['bottom'].set_alpha(0.3)

# Format x-axis labels
ax.set_xlabel(ax.get_xlabel(), labelpad=10)
ax.tick_params(axis='both', which='major', length=0, pad=10)

# Place legend outside of plot area
ax.legend(*ax.get_legend_handles_labels(), frameon=False,
          bbox_to_anchor=(1, 0.5), loc="center left");

[Image: grouped bar chart]
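
For the two-level x-tick labels asked about in the question, one possible approach is to plot a table indexed by (year, country), use the country names as tick labels, and draw the year labels as separate text below the axis. The following is a minimal sketch under stated assumptions, not a definitive implementation: the data is hypothetical and has both a 'won' and a 'loss' value for every (year, country) pair, and the vertical offset -0.12 is an arbitrary choice that may need adjusting.

```python
import matplotlib
matplotlib.use('Agg')  # non-interactive backend; remove this line in a notebook
import numpy as np
import pandas as pd

# Hypothetical data: one 'won' and one 'loss' value per (year, country) pair
years = [2010, 2011]
countries = ['Ireland', 'Japan', 'UK']
rows = [(y, c, r, v)
        for y in years for c in countries
        for r, v in (('won', 2), ('loss', 1))]
df = pd.DataFrame(rows, columns=['year', 'country', 'won/loss', 'value'])

# Rows indexed by (year, country), one column per won/loss category
table = df.pivot_table(index=['year', 'country'], columns='won/loss',
                       values='value', aggfunc='sum', fill_value=0)

ax = table.plot.bar(stacked=True, rot=0)

# First level: country names directly under each bar
ax.set_xticks(range(len(table)))
ax.set_xticklabels(list(table.index.get_level_values('country')))
ax.set_xlabel('')  # hide the automatic label built from the index names
ax.tick_params(axis='x', which='both', length=0)

# Second level: one year label centred under each group of bars.
# x is given in data coordinates, y in axes coordinates (below the axis).
year_level = table.index.get_level_values('year')
year_labels = []
for year in year_level.unique():
    center = np.flatnonzero(year_level == year).mean()
    label = ax.text(center, -0.12, str(year), ha='center', va='top',
                    transform=ax.get_xaxis_transform())
    year_labels.append(label)
```

Because the bar positions are simply 0, 1, 2, ... in row order, the centre of each year group is the mean of the row positions that belong to that year, so the groups do not all need to be the same size.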

Patrick FitzGerald