I have the following data that I am trying to plot.

month      year  total_sales
May        2020  7
June       2020  2
July       2020  1
August     2020  2
September  2020  22
October    2020  11
November   2020  6
December   2020  3
January    2019  3
feburary   2019  11
March      2019  65
April      2019  22
May        2019  33
June       2019  88
July       2019  44
August     2019  12
September  2019  32
October    2019  54
November   2019  76
December   2019  23
January    2018  12
feburary   2018  32
March      2018  234
April      2018  2432
May        2018  432

Here is the code I am using to do it:

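import matplotlib.pyplot as plt

def plot_timeline_data(df):
    fig, ax = plt.subplots()

    # Label the x ticks with the month names, in the order they appear in df
    ax.set_xticklabels(df['month'].unique(), rotation=90)

    # Draw one line per year
    for name, group in df.groupby('year'):
        ax.plot(group['month'], group['total_sales'], label=name, linestyle='--', marker='o')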

    ax.legend()
    plt.tight_layout()
    plt.show()

I want the x labels to run from January to December, but my graph starts at May, runs to December, and then resumes from January to April, as shown in the figure (the exact values in the graph differ because I changed the values). How can I put the months in the correct order?

[Figure: the plot described above, with the x axis running from May to December and then from January to April]

blastoise

3 Answers

You can use the following method. The idea is to sort the month column as shown in this and this post

import pandas as pd

# Capitalize the month names
df["month"] = df["month"].str.capitalize()

# Correct the spelling of February
df['month'] = df['month'].str.replace('Feburary','February')

# Convert to datetime object for sorting
df['month_in'] = pd.DatetimeIndex(pd.to_datetime(df['month'], format='%B')).month

# Sort using the index
df = df.set_index('month_in').sort_index()

plot_timeline_data(df)
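
As an alternative to the helper column, the same ordering can be had by turning the month column into an ordered categorical, so that sort_values follows calendar order rather than alphabetical order. This is only a minimal sketch: the function name sort_by_calendar_month and the small sample frame are made up for illustration, and it assumes an English locale so that calendar.month_name yields "January" through "December".

```python
import calendar

import pandas as pd

# Calendar order: ["January", "February", ..., "December"]
# (assumes an English locale; entry 0 of month_name is an empty string)
MONTHS = list(calendar.month_name)[1:]


def sort_by_calendar_month(df):
    """Return a copy of df sorted by year, then by calendar month."""
    out = df.copy()
    # Normalize capitalization and the misspelling found in the question's data
    out["month"] = out["month"].str.capitalize().replace({"Feburary": "February"})
    # An ordered categorical sorts by position in MONTHS, not alphabetically
    out["month"] = pd.Categorical(out["month"], categories=MONTHS, ordered=True)
    return out.sort_values(["year", "month"]).reset_index(drop=True)


# Small hypothetical sample in the same shape as the question's data
sample = pd.DataFrame({
    "month": ["May", "June", "January", "feburary"],
    "year": [2020, 2020, 2019, 2019],
    "total_sales": [7, 2, 3, 11],
})
ordered = sort_by_calendar_month(sample)
```

Here ordered lists January 2019, February 2019, May 2020 and June 2020, in that order.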


Sheldore

I think you need to change the order of the 'month' column in the pandas dataframe. Try adding:

# Move the first 8 rows (May to December 2020) to the end so that January comes first
df = pd.concat([df.iloc[8:], df.iloc[:8]])

before the for-loop that plots the years.

Teo Cherici

DataFrame.plot makes the job a bit easier for you: it plots each column as a separate line and keeps the row order you prescribe:

import matplotlib.pyplot as plt

# Convert the dataframe to series of years
df = df.set_index(["month","year"])["total_sales"].unstack()
# Sort the index (which is month)
df = df.loc[[
    "January","feburary","March","April","May","June",
    "July", "August", "September","October", "November", "December"
]]
# Plot!
df.plot(marker="o", linestyle="--", rot=90)
# Show all ticks
plt.xticks(range(12), df.index)
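
If you would rather not type out the month list by hand, a variation on the same idea is to index the table by month number and reindex it to 1 through 12. The following is a rough sketch under a few assumptions: the function name monthly_table and the sample frame are hypothetical, the "feburary" spelling is corrected first so the month name can be parsed, and month names are in English.

```python
import pandas as pd


def monthly_table(df):
    """Return a table with one row per month number and one column per year."""
    out = df.copy()
    # Normalize capitalization and the misspelling found in the question's data
    out["month"] = out["month"].str.capitalize().replace({"Feburary": "February"})
    # Month number 1..12 parsed from the full month name
    out["month_num"] = pd.to_datetime(out["month"], format="%B").dt.month
    table = out.pivot(index="month_num", columns="year", values="total_sales")
    # reindex guarantees 12 rows in calendar order; missing months become NaN
    return table.reindex(range(1, 13))


# Small hypothetical sample in the same shape as the question's data
sample = pd.DataFrame({
    "month": ["May", "June", "January", "feburary"],
    "year": [2020, 2020, 2019, 2019],
    "total_sales": [7, 2, 3, 11],
})
table = monthly_table(sample)
```

Calling table.plot(marker="o", linestyle="--") on the result should then draw one line per year with the months already in calendar order.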


tmrlvi