I have the following dataset:

my_df = pd.DataFrame({'id':[1,2,3,4,5,6,7,8],
                      'date':['2019-01-01 07:59:54','2019-01-01 08:00:07','2019-01-01 08:00:07',
                              '2019-01-02 08:00:14','2019-01-02 08:00:16','2019-01-02 08:00:24',
                              '2019-01-03 08:02:38','2019-01-03 08:50:14'],
                      'machine':['A','A','B','C','B','C','D','D'],
                      'group':['Grind','Grind','Weld','Grind','Weld','Grind','Weld','Weld']})
my_df['date'] = pd.to_datetime(my_df['date'])
my_df
    id  date                machine group
0   1   2019-01-01 07:59:54 A       Grind
1   2   2019-01-01 08:00:07 A       Grind
2   3   2019-01-01 08:00:07 B       Weld
3   4   2019-01-02 08:00:14 C       Grind
4   5   2019-01-02 08:00:16 B       Weld
5   6   2019-01-02 08:00:24 C       Grind
6   7   2019-01-03 08:02:38 D       Weld
7   8   2019-01-03 08:50:14 D       Weld

I have tried this:

fig, ax = plt.subplots(figsize=(12,6))
my_df.groupby([pd.Grouper(key='date', freq='D'), 'group'])['machine'].count().plot(ax=ax)
ax.xaxis.set_major_formatter(mdates.DateFormatter('%d %m'))
plt.show()

But it gives me the wrong plot (image not reproduced here).

Could you help me understand what I am doing wrong in my code? Any help will be greatly appreciated.

Henry Ecker
Alexis

1 Answer

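The groupby count returns a single Series indexed by (date, group), so plotting it directly draws one line through all of the values. Unstack the group level after the count so that each group becomes its own column; DataFrame.plot then draws a separate line per column: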

fig, ax = plt.subplots(figsize=(12, 6))
# Create Plot DataFrame
plot_df = (
    my_df.groupby([
        pd.Grouper(key='date', freq='D'), 'group'
    ])['machine'].count().unstack('group')
)
# Plot on ax
plot_df.plot(ax=ax)
# Set Display
ax.xaxis.set_major_formatter(mdates.DateFormatter('%d %m'))
plt.tight_layout()
plt.show()

plot_df:

group       Grind  Weld
date                   
2019-01-01    2.0   1.0
2019-01-02    2.0   1.0
2019-01-03    NaN   2.0
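
The NaN for Grind on 2019-01-03 means no Grind rows fell on that day, so the Grind line stops at 2019-01-02. If a missing combination should instead be read as a count of zero (an assumption about the desired result, not something stated in the question), one option is to pass fill_value=0 to unstack. A minimal sketch, rebuilding the same data so it runs on its own:

```python
import pandas as pd

# Same data as in the question (the 'id' column is not needed for the count)
my_df = pd.DataFrame({
    'date': pd.to_datetime([
        '2019-01-01 07:59:54', '2019-01-01 08:00:07', '2019-01-01 08:00:07',
        '2019-01-02 08:00:14', '2019-01-02 08:00:16', '2019-01-02 08:00:24',
        '2019-01-03 08:02:38', '2019-01-03 08:50:14']),
    'machine': ['A', 'A', 'B', 'C', 'B', 'C', 'D', 'D'],
    'group': ['Grind', 'Grind', 'Weld', 'Grind', 'Weld', 'Grind', 'Weld', 'Weld'],
})

# Count rows per day and group, then move 'group' into the columns.
# fill_value=0 replaces the missing (day, group) combinations with 0.
plot_df_filled = (
    my_df.groupby([pd.Grouper(key='date', freq='D'), 'group'])['machine']
    .count()
    .unstack('group', fill_value=0)
)
print(plot_df_filled)
```

plot_df_filled can then be passed to .plot(ax=ax) in the same way as plot_df; the Grind line would drop to 0 on the last day rather than end early.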

Data and imports:

import pandas as pd
from matplotlib import pyplot as plt, dates as mdates

my_df = pd.DataFrame({
    'id': [1, 2, 3, 4, 5, 6, 7, 8],
    'date': ['2019-01-01 07:59:54', '2019-01-01 08:00:07',
             '2019-01-01 08:00:07',
             '2019-01-02 08:00:14', '2019-01-02 08:00:16',
             '2019-01-02 08:00:24',
             '2019-01-03 08:02:38', '2019-01-03 08:50:14'],
    'machine': ['A', 'A', 'B', 'C', 'B', 'C', 'D', 'D'],
    'group': ['Grind', 'Grind', 'Weld', 'Grind', 'Weld',
              'Grind', 'Weld', 'Weld']
})
my_df['date'] = pd.to_datetime(my_df['date'])
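
As a quick check on why the unstack step matters, the sketch below compares the grouped result before and after unstacking. The names counts and wide are only illustrative and do not appear in the original code:

```python
import pandas as pd

my_df = pd.DataFrame({
    'date': pd.to_datetime([
        '2019-01-01 07:59:54', '2019-01-01 08:00:07', '2019-01-01 08:00:07',
        '2019-01-02 08:00:14', '2019-01-02 08:00:16', '2019-01-02 08:00:24',
        '2019-01-03 08:02:38', '2019-01-03 08:50:14']),
    'machine': ['A', 'A', 'B', 'C', 'B', 'C', 'D', 'D'],
    'group': ['Grind', 'Grind', 'Weld', 'Grind', 'Weld', 'Grind', 'Weld', 'Weld'],
})

# Before unstacking: one Series whose index has two levels (date, group).
# Plotting this draws a single line through every (date, group) value.
counts = my_df.groupby([pd.Grouper(key='date', freq='D'), 'group'])['machine'].count()
print(counts.index.names)    # the two index levels: date and group
print(len(counts))           # one entry per (day, group) pair that occurs

# After unstacking: a DataFrame indexed by date with one column per group.
# DataFrame.plot draws one line per column.
wide = counts.unstack('group')
print(list(wide.columns))    # ['Grind', 'Weld']
```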
Henry Ecker
  • Thank you very much @Henry Ecker, you know it was quite difficult for me to guess what was happening with the plot, and your answer is great. Hope it helps other python learners too! Have a great day! – Alexis Aug 12 '21 at 23:01