0

I have a dataset like this:

        Freq
Date    
01-11   1
05-11   1
04-10   1
02-12   1
12-10   1
... ...
07-20   286
06-20   288
05-20   390
04-20   471
03-20   510

I would like to plot the frequency by month, to create 12 plots one per each month, showing the frequency through through years. My approach would be: use groupby to group all the months, then plot the histogram of frequency. But I have encountered a difficulty regarding how to extract months from Date. I have tried as follows:

df.groupby(df['Date'].dt.strftime('%m')).size().sort_values()

and then I used plot(kind=bar), trying to apply this once per each month. But the output is different from that expected.

  • `Any suggestion on how to..` - Spend some time with the [Pandas User guide](https://pandas.pydata.org/docs/user_guide/index.html) and the [Matplotlib tutorials](https://matplotlib.org/tutorials/index.html). ... Search and read lots of Q&A's – wwii Oct 31 '20 at 01:45
  • added code to show what I have tried –  Oct 31 '20 at 02:05

1 Answers1

0

IIUC, I think want to reshape you dataframe like this:

import matplotlib.pyplot as plt
import pandas as pd
import numpy as np

s = pd.Series(np.random.randint(0,100, 120), index=pd.date_range('1/1/2010',periods=120, freq='MS'))

df = s.reset_index()

df['Year'] = df['index'].dt.year
df['month'] = df['index'].dt.month

ax = df.set_index(['Year', 'month'])[0].unstack('month').plot.bar(subplots=True, layout=(4,3), figsize=(15,10))
plt.suptitle('Values by Months of the Year over Ten-year period');

Chart:

enter image description here Input dataframe for chart:

print(df.set_index(['Year', 'month'])[0].unstack('month'))

Years on index and months as columns:

month  1   2   3   4   5   6   7   8   9   10  11  12
Year                                                 
2010   85  63  68  35   3  77  14  72  78  26  52  61
2011   47  92  37  89  24  62  76  10  88   6  48  77
2012   51   6  27  92  90   6  76  69  45  55   8   7
2013   45  45  53  29  32  13  23  39  82  68  83  41
2014   46   0  53  33  92  62  49  54  39  51  19  20
2015   55  81  95  67  75  93  92  94  67  72  40  26
2016    9  44  36  89  28  92  25  52  80  32  88  37
2017   52  60   0  34   9  72  70  42  88  18  51   3
2018   96  79  83  60  56  12   8  54  57  68   9  69
2019   32  98  47  23  35  28  59  81  86  32  50  52
Scott Boston
  • 147,308
  • 15
  • 139
  • 187
  • Hi Scott, thanks for answering. I have got this key error: KeyError: 0 when I define ax. Do you know what it does mean? –  Oct 31 '20 at 02:40
  • Yes, my series has no name therefore, I am using a default column header as 0. You need to put your actual column header name instead of 0 after the set_index. – Scott Boston Oct 31 '20 at 02:41
  • Sorry again. I used both Freq and Date but I have got a ValueError: Index contains duplicate entries, cannot reshape. I think I am not understanding. Sorry –  Oct 31 '20 at 02:46
  • 1
    You'll need to use groupby and unstack instead of set_index and unstack.https://stackoverflow.com/q/28337117/6361531 – Scott Boston Oct 31 '20 at 04:03
  • Thanks @Scott Boston. I think I will need to ask for a new question as, even trying as suggested, I could not find a way to plot this values. –  Oct 31 '20 at 13:54