I have a dataframe that looks like this
Error \nNumber Date Type of error
0 2122 2020-01-09 NHS Spine check - External error
1 2123 2020-01-09 EP3- Run failure
2 2124 2020-02-09 NHS Spine check - External error
3 2125 2020-03-09 NHS Spine check - External error
4 2126 2020-04-09 NHS Spine check - External error
.. ... ... ...
837 2949 2023-03-07 DE - Data Entry
838 2950 2023-03-07 EI - Error of interpretation
839 2951 2023-03-07 EX -External error - other
840 2952 2023-04-07 EP8- SOPs not being followed
841 2953 2023-06-07 OT - Other
Here a reproducible data set converted in a dictionary
data.head().to_dict()
{'Error \nNumber': {0: '2122', 1: '2123', 2: '2124', 3: '2125', 4: '2126'},
'Date': {0: Timestamp('2020-01-09 00:00:00'),
1: Timestamp('2020-01-09 00:00:00'),
2: Timestamp('2020-02-09 00:00:00'),
3: Timestamp('2020-03-09 00:00:00'),
4: Timestamp('2020-04-09 00:00:00')},
'Type of error ': {0: 'NHS Spine check - External error',
1: 'EP3- Run failure',
2: 'NHS Spine check - External error',
3: 'NHS Spine check - External error',
4: 'NHS Spine check - External error'}}
I am trying to count the values in column "Type of error", group them by months and split data in new columns like this (following example only takes the first 5 rows
Date NHS Spine check - External error EP3- Run failure
2020-09 4 1
I have tried the answers found here but the differences prevent me to do what I want