
I can render a line chart from my plotting data, but the x-axis tick labels are not displayed correctly. My dataframe has a period-like datetime index, and I want those periods to appear correctly along the x-axis. I tried the approaches from several existing posts about axis tickers, but I still don't get a correct plot. How can I fix this? Thanks.

EDA data

here is the plotting data on gist

my attempt:

Here is my current attempt:

import pandas as pd
import matplotlib.pyplot as plt
import matplotlib.dates as mdates
from matplotlib.ticker import FuncFormatter

df = pd.read_csv('plot_data.csv', encoding='utf-8')
df.div(df.Total, axis=0).applymap(lambda x: f'{x * 100:.2f}%')
fig, ax1 = plt.subplots(figsize=(14,6))
_ = df.div(df.Total, axis=0).iloc[:, :-1].plot(kind='line', ax=ax1, marker='o', ls='--')
ax1.yaxis.set_major_formatter(FuncFormatter(lambda y, _: '{:.0%}'.format(y)))
ax1.xaxis.set_major_locator(mdates.DayLocator())
ax1.xaxis.set_major_formatter(mdates.DateFormatter('%m-%d-%Y'))
plt.show()

goal

I want to render a line chart where the y-axis shows percentages and the x-axis shows the periods (quarters) across the years. With my code, the x-axis tick labels are not displayed correctly. Any idea?

beyond_inifinity
  • Just curious why you want to show each day when your data is by quarter? It looks pretty good if it's just showing the years as the x-axis tick labels. – mechanical_meat Feb 04 '20 at 18:16
  • @VorsprungdurchTechnik not day but periods such as `2014-01-01, 2014-04-01` and so on. I am trying to show by quarter. any idea? why ticking is not working here? – beyond_inifinity Feb 04 '20 at 18:17
  • I can't make heads nor tails of what's going on with the DateFormatter. Maybe KT12 is on to something about there being a pandas issue with this. – mechanical_meat Feb 04 '20 at 19:03

2 Answers


The easiest approach is to let pandas handle the axis. To show dates on the x-axis, pandas expects those dates to be the index, so call df.set_index('quarter', inplace=True).

Because the quarter values are read from the CSV as strings, pandas sets up an x-axis that looks like a date axis but is in reality a categorical axis (positions 0, 1, 2, 3, ...) for which pandas provides the tick labels.
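To make the positional axis concrete, here is a minimal sketch with made-up data. The column names US and UK, the values, and the four quarter strings are assumptions for illustration, not the contents of plot_data.csv. It plots a frame with a string index and inspects the x-coordinates that end up on the matplotlib line.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical stand-in for plot_data.csv: quarter labels kept as plain strings.
quarters = ["2014-01-01", "2014-04-01", "2014-07-01", "2014-10-01"]
demo = pd.DataFrame(
    {"US": [0.5, 0.6, 0.4, 0.7], "UK": [0.5, 0.4, 0.6, 0.3]},
    index=pd.Index(quarters, name="quarter"),
)

fig, ax = plt.subplots()
demo.plot(kind="line", ax=ax, marker="o", ls="--")

# The first line is drawn at row positions, not at date values.
x_positions = [float(x) for x in ax.lines[0].get_xdata()]
print(x_positions)
```

On an axis like this, the locators and formatters from matplotlib.dates have no real dates to work with, which is likely why they produce odd labels.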

To format the y-axis as percentages, use PercentFormatter with xmax=1 (so that a value of 1 maps to 100%, instead of the default xmax=100) and decimals to set the number of decimal places.

import pandas as pd
import matplotlib.pyplot as plt
import matplotlib.dates as mdates
import matplotlib.ticker as mtick

filename = 'plot_data.csv'
df = pd.read_csv(filename, encoding='utf-8')
df.set_index('quarter', inplace=True)
fig, ax1 = plt.subplots(figsize=(14, 6))
df.div(df.Total, axis=0).iloc[:, :-1].plot(kind='line', ax=ax1, marker='o', ls='--')
ax1.yaxis.set_major_formatter(mtick.PercentFormatter(xmax=1, decimals=0))
plt.xticks(range(len(df.index)), df.index, rotation=90)
plt.show()
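
As a quick check of what xmax changes, the sketch below calls PercentFormatter.format_pct directly on a fraction. The sample value 0.25 and the display range passed as the second argument are arbitrary choices for illustration.

```python
import matplotlib.ticker as mtick

# xmax is the data value that corresponds to 100%.
as_fraction = mtick.PercentFormatter(xmax=1, decimals=0)
default_scale = mtick.PercentFormatter(decimals=0)  # xmax defaults to 100

# format_pct(value, display_range); the range is only used when decimals is None.
label_fraction = as_fraction.format_pct(0.25, 1.0)
label_default = default_scale.format_pct(0.25, 1.0)
print(label_fraction, label_default)
```

With fractions on the y-axis, leaving xmax at its default would label 0.25 as a fraction of a percent, so xmax=1 is the setting that matches data produced by df.div(df.Total, axis=0).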

resulting plot

Alternatively, you could convert the index to matplotlib dates and use matplotlib's formatting and locators:

import pandas as pd
import matplotlib.pyplot as plt
import matplotlib.dates as mdates
import matplotlib.ticker as mtick

filename = 'plot_data.csv'
df = pd.read_csv(filename, encoding='utf-8')
df.quarter = [pd.to_datetime(d).date() for d in df.quarter]
df.set_index('quarter', inplace=True)
fig, ax1 = plt.subplots(figsize=(14, 6))
_ = df.div(df.Total, axis=0).iloc[:, :-1].plot(kind='line', ax=ax1, marker='o', ls='--')
ax1.yaxis.set_major_formatter(mtick.PercentFormatter(xmax=1, decimals=0))
ax1.xaxis.set_major_locator(mdates.MonthLocator(bymonthday=1, interval=3))
ax1.xaxis.set_major_formatter(mdates.DateFormatter('%m-%d-%Y'))
plt.xticks(rotation=45)
plt.tight_layout()
plt.show()
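
The locator and formatter pair can also be tried in isolation, without the CSV. The following sketch uses made-up quarterly values and plots them with matplotlib directly (ax.plot), so the x-axis holds real matplotlib dates. It uses bymonth=(1, 4, 7, 10) instead of interval=3 to pin the ticks to quarter starts; that is a substitution made for this sketch, not what the code above does.

```python
import datetime as dt

import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.dates as mdates
import matplotlib.pyplot as plt

# Hypothetical quarterly shares; not taken from plot_data.csv.
quarter_starts = [dt.datetime(2014, month, 1) for month in (1, 4, 7, 10)]
shares = [0.50, 0.60, 0.40, 0.70]

fig, ax = plt.subplots(figsize=(8, 4))
ax.plot(quarter_starts, shares, marker="o", ls="--")

# One tick on the first day of each quarter, labelled month-day-year.
locator = mdates.MonthLocator(bymonth=(1, 4, 7, 10), bymonthday=1)
formatter = mdates.DateFormatter("%m-%d-%Y")
ax.xaxis.set_major_locator(locator)
ax.xaxis.set_major_formatter(formatter)

fig.canvas.draw()  # forces autoscaling so the tick positions are final
tick_labels = [formatter(tick) for tick in ax.get_xticks()]
print(tick_labels)
```

Naming the months explicitly keeps the ticks on January, April, July and October regardless of where the axis limits start.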

plot using matplotlib dates

JohanC
  • this plot is confusing because understanding the x-axis is not so easy. is that possible to make a plot like this [SO post](https://stackoverflow.com/questions/43968985/changing-the-formatting-of-a-datetime-axis-in-matplotlib)? any idea? – beyond_inifinity Feb 04 '20 at 22:27

Convert the quarter column to datetime format and then set it as the index:

import pandas as pd
import matplotlib.pyplot as plt
import matplotlib.dates as mdates

df = pd.read_csv('plot_data.csv', encoding='utf-8')
df['quarter'] = pd.to_datetime(df['quarter'], format='%Y-%m-%d')
df = df.set_index(df['quarter'])
df = df.sort_index()
fig, ax1 = plt.subplots(figsize=(14,6))
_ = df.drop('quarter', axis=1).div(df.Total, axis=0).iloc[:, :-1].plot(kind='line', ax=ax1)
ax1.yaxis.set_major_formatter(plt.FuncFormatter(lambda y, _: '{:.0%}'.format(y))) 
ax1.set_xticks(df.index)
ax1.xaxis_date()
plt.show()


After struggling with matplotlib, I found a solution using seaborn.

import pandas as pd
import matplotlib.pyplot as plt
import matplotlib.ticker as mtick
import seaborn as sns
sns.set()
df = pd.read_csv('plot_data.csv', encoding='utf-8')
df['quarter'] = pd.to_datetime(df['quarter'], format='%Y-%m-%d')
df = df.set_index(df['quarter'])
df = df.sort_index()
df_clean = df.drop('quarter', axis=1).div(df.Total, axis=0)
df_clean.drop('Total', axis=1, inplace=True)
df_us = df_clean.unstack().reset_index().copy()
df_us = df_us.rename(columns={'level_0':'Country', 0:'Percent'})
g = sns.lineplot(data=df_us, x='quarter', y='Percent', hue='Country')
g.set(xticks=df.index)
plt.xticks(rotation=30)
g.yaxis.set_major_formatter(plt.FuncFormatter(lambda y, _: '{:.0%}'.format(y)))
plt.savefig('sns.png')
plt.show()
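
The lineplot call above works on long-form data, with one row per country and quarter. The reshaping step can be examined on its own with a small made-up frame; the country columns US and UK and their values are assumptions for illustration.

```python
import pandas as pd

# Hypothetical wide frame: one column per country, indexed by quarter.
wide = pd.DataFrame(
    {"US": [0.5, 0.6], "UK": [0.5, 0.4]},
    index=pd.to_datetime(["2014-01-01", "2014-04-01"]).rename("quarter"),
)

# unstack() on a single-level index returns a Series keyed by (column, quarter);
# reset_index() then turns both key levels into ordinary columns.
long = wide.unstack().reset_index()
long = long.rename(columns={"level_0": "Country", 0: "Percent"})
print(long)
```

The unnamed column level shows up as level_0 and the values as 0, which is why the rename in the seaborn code targets exactly those two names.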


KT12