I have the following script to plot data from a csv file:

csv:

Date,% responding right,% responding wrong
02/08/16,46,42
09/08/16,45,44
...
21/08/18,41,47
29/08/18,42,47
04/09/18,42,48

script:

import pandas as pd
import matplotlib.pyplot as plt
import matplotlib.dates as mdates

df = pd.read_csv("brexit.csv")
df['Date'] = pd.to_datetime(df['Date'])
df.sort_values(by=['Date'])
plt.plot(df['Date'], df['% responding right'])
plt.plot(df['Date'], df['% responding wrong'])
plt.gcf().autofmt_xdate()
plt.savefig('brexit.png', dpi=300)

Result:

[image]

I tried to modify my script so the dates on the x axis are properly formatted, and the dates only show the year as it increases incrementally:

df = pd.read_csv("brexit.csv")
df['Date'] = pd.to_datetime(df['Date'])
df.sort_values(by=['Date'])
plt.plot(df['Date'], df['% responding right'])
plt.plot(df['Date'], df['% responding wrong'])
plt.gcf().autofmt_xdate()
plt.savefig('brexit.png', dpi=300)

But unfortunately this seems to mess up the order of my y values and still displays the full date instead of just the year:

[image]

How can I modify my code so the graph is displayed properly?

Yes
  • I'd take a look at the comprehensive answers here https://stackoverflow.com/questions/45704366/how-to-change-the-datetime-tick-label-frequency-for-matplotlib-plots (where you'll probably want to swap out the [`MonthLocator`](https://matplotlib.org/stable/api/dates_api.html#matplotlib.dates.MonthLocator) in those answers for a [`YearLocator`](https://matplotlib.org/stable/api/dates_api.html#matplotlib.dates.YearLocator) to set the intervals.) – Matt Pitkin Mar 28 '23 at 12:58
  • Does this answer your question? [pandas to_datetime parsing wrong year](https://stackoverflow.com/questions/37766353/pandas-to-datetime-parsing-wrong-year) – Jody Klymak Mar 28 '23 at 15:58
  • I think you need to make sure your dates are parsed correctly – Jody Klymak Mar 28 '23 at 15:58

2 Answers

You can try plotting with pandas alone; its plot method handles the date labelling on the x axis automatically:

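import pandas as pd

df = pd.read_csv('brexit.csv')
# Parse the day-first dates explicitly so 02/08/16 is read as 2 August 2016
df['Date'] = pd.to_datetime(df['Date'], format='%d/%m/%y')
# Use the dates as the index so pandas puts them on the x axis
df = df.set_index('Date')
df.plot()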

Note: you can still use matplotlib functions together with the pandas plot to customize your graph.
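As a rough sketch of that combination, the snippet below plots with pandas and then sets year-only ticks through matplotlib. It makes a few assumptions that are not in the answer: the CSV is replaced by an in-memory string holding only the rows quoted in the question, `x_compat=True` is passed so that pandas leaves the x axis in matplotlib's date units, and the `Agg` backend is selected so the code runs without a display.

```python
import io

import matplotlib
matplotlib.use("Agg")  # assumption: headless backend, no window is opened
import matplotlib.dates as mdates
import pandas as pd

# Stand-in for brexit.csv: only the rows quoted in the question
csv_text = """Date,% responding right,% responding wrong
02/08/16,46,42
09/08/16,45,44
21/08/18,41,47
29/08/18,42,47
04/09/18,42,48
"""

df = pd.read_csv(io.StringIO(csv_text))
df["Date"] = pd.to_datetime(df["Date"], format="%d/%m/%y")  # day-first dates
df = df.set_index("Date").sort_index()

# x_compat=True keeps the x axis in matplotlib date units,
# so matplotlib.dates locators and formatters can be applied afterwards
ax = df.plot(x_compat=True)
ax.xaxis.set_major_locator(mdates.YearLocator())          # one tick per year
ax.xaxis.set_major_formatter(mdates.DateFormatter("%Y"))  # label = year only
ax.figure.autofmt_xdate()
```

With the real file, `pd.read_csv("brexit.csv")` would replace the `io.StringIO` stand-in, and `ax.figure.savefig(...)` can be used to write the image.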

Lucas
import pandas as pd
import matplotlib.pyplot as plt
import matplotlib.dates as mdates
import matplotlib.ticker as mticker  # only needed for the LinearLocator alternative

df = pd.DataFrame({'Date':['02/08/16', '09/08/16', '21/08/18', '29/08/18', '04/09/18', '24/09/19', '04/09/21'],
                   '% responding right':[45, 46, 41, 42, 42, 43, 44],
                   '% responding wrong':[42, 44, 47, 47, 48, 49, 50]})

# The dates are day-first with a two-digit year, so state the format explicitly
df['Date'] = pd.to_datetime(df['Date'], format='%d/%m/%y')

fig, ax = plt.subplots()
ax.plot(df['Date'], df['% responding right'])
ax.plot(df['Date'], df['% responding wrong'])
ax.xaxis.set_major_formatter(mdates.DateFormatter('%Y'))  # label ticks with the year only
locator = mdates.YearLocator(month=8, day=2)  # one tick per year, on 2 August
#locator = mticker.LinearLocator(4)
ax.xaxis.set_major_locator(locator)

fig.autofmt_xdate()
plt.show()

The 'Date' column is parsed correctly by passing format='%d/%m/%y' to pd.to_datetime. Without an explicit format, the days and months were getting mixed up.
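To illustrate the difference the format makes, here is a small sketch using three of the dates from the question. The month-first format is applied to the first value only, purely for comparison; the variable names are made up for this example.

```python
import pandas as pd

raw = pd.Series(["02/08/16", "09/08/16", "21/08/18"])

# Day-first format, as used in the answer
day_first = pd.to_datetime(raw, format="%d/%m/%y")
print(day_first.dt.strftime("%Y-%m-%d").tolist())
# ['2016-08-02', '2016-08-09', '2018-08-21']

# The same first string read month-first lands on a different date
month_first = pd.to_datetime("02/08/16", format="%m/%d/%y")
print(month_first.strftime("%Y-%m-%d"))
# 2016-02-08
```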

Passing '%Y' to matplotlib.dates.DateFormatter makes the tick labels show only the year.

In YearLocator, month and day are set to those of the first date (2 August), so each yearly tick falls on that date.

You can also show a fixed number of ticks instead, for example 4 (comment out the YearLocator line and uncomment the LinearLocator line).
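One more detail that may matter when applying this to the question's script: `df.sort_values(by=['Date'])` returns a new, sorted DataFrame and leaves `df` untouched unless the result is assigned back. The sketch below uses a tiny made-up frame (the `value` column is hypothetical) to show the difference.

```python
import pandas as pd

df = pd.DataFrame({
    "Date": pd.to_datetime(["21/08/18", "02/08/16", "09/08/16"], format="%d/%m/%y"),
    "value": [41, 46, 45],  # hypothetical column, for illustration only
})

df.sort_values(by="Date")  # result is discarded, df keeps its original order
still_unsorted = not df["Date"].is_monotonic_increasing

# Assign the result back so the plotted line follows the dates in order
df = df.sort_values(by="Date").reset_index(drop=True)
now_sorted = df["Date"].is_monotonic_increasing
```

Plotting an unsorted frame with `ax.plot` connects the points in row order, which can make the line jump back and forth along the x axis.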

inquirer