I am exploring a dataset of road accidents in the UK between 2005 and 2015. I converted the date column to datetime format, removed some columns, and created new ones. To simplify the example, I reduced the dataset to only the columns needed:
                     date
Accident_Index
200501BS00001  2005-04-01
200501BS00002  2005-05-01
200501BS00003  2005-06-01
200501BS00004  2005-07-01
200501BS00005  2005-10-01
I tried to plot a line chart of the number of accidents per month across all the years:
acc_by_year_and_month = acc_data["date"].groupby([acc_data.date.dt.year, acc_data.date.dt.month]).agg("count")
acc_by_year_and_month.plot(kind='line', figsize = (8,6))
plt.ylabel("Number of accidents")
plt.xlabel("Year and Month")
plt.title("Number of accidents by year")
plt.show()
Unfortunately, the x-axis shows only four or five year-month labels, so it is hard to see where the peaks and the minimum values fall in each year.
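One likely reason for the sparse labels is that the grouped result is indexed by (year, month) pairs rather than by dates, so the plot has no calendar axis to place ticks on. A minimal sketch, assuming a small made-up `acc_data` frame in place of the real dataset and an arbitrary tick spacing (every January and July), that counts accidents per calendar month and plots them against real dates:

```python
import matplotlib.dates as mdates
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical stand-in for acc_data: one row per accident, datetime "date" column.
acc_data = pd.DataFrame(
    {
        "date": pd.to_datetime(
            ["2005-01-04", "2005-01-20", "2005-02-11",
             "2006-01-03", "2006-02-07", "2006-02-28"]
        )
    }
)

# Count accidents per calendar month; the result is indexed by monthly periods.
monthly = acc_data.groupby(acc_data["date"].dt.to_period("M")).size()

fig, ax = plt.subplots(figsize=(8, 6))
# Convert the periods to timestamps so matplotlib treats the x-axis as dates.
ax.plot(monthly.index.to_timestamp(), monthly.values)

# Assumed tick spacing: one labelled tick every January and July.
ax.xaxis.set_major_locator(mdates.MonthLocator(bymonth=(1, 7)))
ax.xaxis.set_major_formatter(mdates.DateFormatter("%Y-%m"))
fig.autofmt_xdate()  # rotate the date labels so they do not overlap

ax.set_xlabel("Year and Month")
ax.set_ylabel("Number of accidents")
ax.set_title("Number of accidents by month")
# plt.show()  # uncomment when running in a script or notebook
```

With a date axis, the tick spacing can be tightened or loosened by changing the `bymonth` argument.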
I also tried to make the chart interactive by using:
%matplotlib notebook
import matplotlib.pyplot as plt
However, when I move the mouse pointer over the chart I do get x and y values, but the two values are the same and the year-month combination is not shown, so this option did not help either.
I would like either an interactive line chart where hovering the mouse pointer shows the x and y values (for example x=2005-1, y=17487) or, which I think is the easier option, a printout of the month with the fewest accidents in each year:
2005 - 2 - 14383 (In Feb 2005 there were 14383 accidents, which is the min value for 2005.)
2006 - 2 - 13818 (In Feb 2006 there were 13818 accidents, which is the min value for 2006.)
..
and so on up to 2015.
Printing the variable acc_by_year_and_month gives something very close to the desired output:
2005 - 1 - 17487
- 2 - 14383
...
2006 - 1 - 16026
...
So what remains is to find the minimum value for each year and print it.
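A minimal sketch of one way to do this, assuming `acc_by_year_and_month` is a Series with a (year, month) MultiIndex as in the printout above. The four counts are the ones quoted in this question; the level names `year` and `month` are made up for the example, which is why the code refers to the year level by position:

```python
import pandas as pd

# Stand-in for acc_by_year_and_month, using only the counts quoted in the question.
index = pd.MultiIndex.from_tuples(
    [(2005, 1), (2005, 2), (2006, 1), (2006, 2)],
    names=["year", "month"],  # hypothetical level names
)
acc_by_year_and_month = pd.Series([17487, 14383, 16026, 13818], index=index)

# For each year (index level 0), idxmin returns the (year, month) label
# of the row with the smallest count.
min_labels = acc_by_year_and_month.groupby(level=0).idxmin()

# Selecting those labels keeps year, month and count together.
yearly_minimums = acc_by_year_and_month.loc[min_labels.tolist()]

for (year, month), count in yearly_minimums.items():
    print(f"{year} - {month} - {count}")
```

For these four values the loop prints `2005 - 2 - 14383` and `2006 - 2 - 13818`. Replacing `idxmin` with `idxmax` would give the peak month of each year instead.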