I am using the following code to plot a pretty large dataset:

import matplotlib.pyplot as plt    
label_order = ['a', 'b']
df_sorted = df_unsorted.sort_values(by='cet_hour')
df_sorted.groupby(['cet_hour', 'col_y']).size().unstack()[label_order].plot(kind='bar', stacked=True)
plt.xlabel('Hours of the Day')
plt.ticklabel_format(style='plain', axis='y')
plt.show()

Most of the code runs fine, but the hours of the day are not in order on the x-axis. The first two bars are hours 0 and 1, followed by 10 to 19, then 2, and so on. The data itself is not the issue: it starts at 6 am on one day and ends at 6 pm on another day a few weeks later, with no gaps in between. When I apply the same code to days of the week, the x-axis is sorted correctly from Monday to Sunday. What could be the problem?

Abir
  • If I had to guess, I would say your hour data are strings and not numerical. That would explain their sorting. I suggest providing a [minimal, complete, and reproducible example](https://stackoverflow.com/help/minimal-reproducible-example). Please also read [How to make good reproducible pandas examples](https://stackoverflow.com/questions/20109391/how-to-make-good-reproducible-pandas-examples). – Mr. T Mar 21 '22 at 09:29
  • Thank you for your comment. I changed the datatype to 'int' from 'object' and it sorted it perfectly. – Abir Mar 21 '22 at 10:56
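
For anyone hitting the same ordering problem, here is a minimal sketch of the fix described in the comments above. The sample data is made up for illustration, and it assumes the `cet_hour` column holds hours as strings (dtype `object`), which is what the comments point to:

```python
import pandas as pd

# Hypothetical sample data: hours stored as strings (dtype 'object').
df_unsorted = pd.DataFrame({
    "cet_hour": ["0", "1", "2", "10", "11", "2", "10", "0"],
    "col_y": ["a", "b", "a", "b", "a", "b", "a", "a"],
})

label_order = ["a", "b"]


def hourly_counts(df):
    """Count rows per (cet_hour, col_y), one column per label in label_order."""
    counts = df.groupby(["cet_hour", "col_y"]).size().unstack(fill_value=0)
    return counts[label_order]


# With string hours, groupby sorts the keys lexicographically,
# so '10' and '11' come before '2'.
string_index = list(hourly_counts(df_unsorted).index)

# Casting the column to int makes the group keys sort numerically.
df_fixed = df_unsorted.assign(cet_hour=df_unsorted["cet_hour"].astype(int))
counts_fixed = hourly_counts(df_fixed)
int_index = list(counts_fixed.index)

print(string_index)
print(int_index)
```

Calling `counts_fixed.plot(kind='bar', stacked=True)` as in the question should then draw the bars in numeric hour order. The separate `sort_values` call is not what decides the bar order here, since `groupby` sorts its keys by default.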

0 Answers