
I'm using a csv file containing date time, temperature and humidity data, then loading it using pandas and matplotlib pyplot to create graphs of the data.

For some date-time values I get the error "ValueError: year is out of range", even though the values are valid and there is no obvious pattern to which ones fail.

This is on a Raspberry Pi (fully updated via apt-get) with Python 2.7.13, pandas 0.19.2 and matplotlib 2.0.0. I've reviewed countless pages via Google searches but haven't got anything to work yet.

Note that I'm using a day/month/year format (I'm in New Zealand).

There is no obvious difference between good and bad lines, i.e. they all appear to have valid date and time values plus identical line endings (CR LF).

The code which shows the issue:

    #!/usr/bin/env python2

    import pandas as pd
    import matplotlib.pyplot as plt
    import matplotlib.dates as mdates

    filename = 'tempfile.csv'

    df = pd.read_csv(filename, names=['DateTime','Air Temperature','Humidity','CPU Temperature'], header = 0)

    df['DateTime'] =  pd.to_datetime(df['DateTime'], format='%d/%m/%Y %H:%M', utc=None)

    df.plot(kind='line',x='DateTime',y='Air Temperature')
    ax = plt.gca()

    ax.xaxis.set_major_formatter(mdates.DateFormatter('%H:%M'))

    plt.savefig('graph.png')
    plt.close()

Normally this generates the graph; however, some values result in an error.

Example data file:

    16/06/2019 02:15,13.7,90.6,30.0
    16/06/2019 02:30,13.7,92.2,30.0
    16/06/2019 02:45,13.6,92.2,30.0
    16/06/2019 03:00,13.4,92.0,30.0
    16/06/2019 03:15,13.5,91.9,28.9
    16/06/2019 03:30,13.5,91.9,28.9

Line which results in the error:

    16/06/2019 03:15,13.5,91.9,28.9

One "fix" is to reduce the minute value of the problem line by 1. Changing it to the following avoids the error:

    16/06/2019 03:14,13.5,91.9,28.9

Another "fix" is to keep only the first three lines or only the last three lines of the CSV data.

Another "fix" is to comment out the following line. The code then works, but the x-axis formatting is lost.

    ax.xaxis.set_major_formatter(mdates.DateFormatter('%H:%M'))

The fact that this works makes me think that the error message is incorrectly pointing to a year problem.

The full error message is:

    Traceback (most recent call last):
      File "./max.py", line 18, in <module> plt.savefig('graph.png')
      File "/usr/lib/python2.7/dist-packages/matplotlib/pyplot.py", line 697, in savefig res = fig.savefig(*args, **kwargs)
      File "/usr/lib/python2.7/dist-packages/matplotlib/figure.py", line 1572, in savefig self.canvas.print_figure(*args, **kwargs)
      File "/usr/lib/python2.7/dist-packages/matplotlib/backend_bases.py", line 2244, in print_figure **kwargs)
      File "/usr/lib/python2.7/dist-packages/matplotlib/backends/backend_agg.py", line 545, in print_png FigureCanvasAgg.draw(self)
      File "/usr/lib/python2.7/dist-packages/matplotlib/backends/backend_agg.py", line 464, in draw self.figure.draw(self.renderer)
      File "/usr/lib/python2.7/dist-packages/matplotlib/artist.py", line 63, in draw_wrapper draw(artist, renderer, *args, **kwargs)
      File "/usr/lib/python2.7/dist-packages/matplotlib/figure.py", line 1143, in draw renderer, self, dsu, self.suppressComposite)
      File "/usr/lib/python2.7/dist-packages/matplotlib/image.py", line 139, in _draw_list_compositing_images a.draw(renderer)
      File "/usr/lib/python2.7/dist-packages/matplotlib/artist.py", line 63, in draw_wrapper draw(artist, renderer, *args, **kwargs)
      File "/usr/lib/python2.7/dist-packages/matplotlib/axes/_base.py", line 2409, in draw mimage._draw_list_compositing_images(renderer, self, dsu)
      File "/usr/lib/python2.7/dist-packages/matplotlib/image.py", line 139, in _draw_list_compositing_images a.draw(renderer)
      File "/usr/lib/python2.7/dist-packages/matplotlib/artist.py", line 63, in draw_wrapper draw(artist, renderer, *args, **kwargs)
      File "/usr/lib/python2.7/dist-packages/matplotlib/axis.py", line 1136, in draw ticks_to_draw = self._update_ticks(renderer)
      File "/usr/lib/python2.7/dist-packages/matplotlib/axis.py", line 969, in _update_ticks tick_tups = [t for t in self.iter_ticks()]
      File "/usr/lib/python2.7/dist-packages/matplotlib/axis.py", line 916, in iter_ticks for i, val in enumerate(majorLocs)]
      File "/usr/lib/python2.7/dist-packages/matplotlib/dates.py", line 466, in __call__ dt = num2date(x, self.tz)
      File "/usr/lib/python2.7/dist-packages/matplotlib/dates.py", line 401, in num2date return _from_ordinalf(x, tz)
      File "/usr/lib/python2.7/dist-packages/matplotlib/dates.py", line 254, in _from_ordinalf dt = datetime.datetime.fromordinal(ix).replace(tzinfo=UTC)
    ValueError: year is out of range
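The final frame of the traceback can be reproduced on its own with the standard library. The sketch below is only an illustration under an assumption that the traceback does not confirm: that the tick value reaching `num2date` is something much larger than a day count, here a hypothetical "minutes since 1970" figure for the first sample. `datetime.datetime.fromordinal` expects days since year 1, so such a value lands far beyond year 9999.

```python
import datetime

# First timestamp from the example data.
first_sample = datetime.datetime(2019, 6, 16, 2, 15)

# A proper proleptic Gregorian day count, the unit fromordinal() expects.
day_number = first_sample.toordinal()

# Hypothetical mis-scaled value: minutes since 1970-01-01 (an assumption,
# chosen only to show what happens when the unit is not days).
epoch = datetime.datetime(1970, 1, 1)
minutes_since_epoch = int((first_sample - epoch).total_seconds() // 60)

# The day count round-trips to the original date.
round_trip = datetime.datetime.fromordinal(day_number)

# The minute count is read as a day count and overflows the year range.
try:
    datetime.datetime.fromordinal(minutes_since_epoch)
    error = None
except ValueError as exc:
    error = str(exc)

print(round_trip.date(), day_number, minutes_since_epoch, error)
```

If this is what happens, the message is accurate for the number matplotlib was given, even though every date in the CSV file is valid.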

Updated to only show a minimum test case.

The statistics function was one that calculated the mean and standard deviation, which has been removed from the above code as it didn't impact the outcome.

NZ Kiwi
  • What is this `statics` in `description, average, std = statics(df[data],units,dataType)`? – Imperishable Night Jun 15 '19 at 08:50
  • Please provide a [mcve] and state the version of pandas and matplotlib in use. – ImportanceOfBeingErnest Jun 15 '19 at 11:41
  • @Imperishable Night - this was a function to calculate the average and standard deviation. It has been removed as it didn't impact the issue. – NZ Kiwi Jun 15 '19 at 22:11
  • @ImportanceOfBeingErnest - I've reduced the code to the minimum to show the issue, plus listed the version numbers of pandas and matplotlib being used. – NZ Kiwi Jun 15 '19 at 22:12
  • From my investigations, I now believe this is a bug. It seems that the datetime values are mapped to numbers in different ways depending on whether the input data is detected as a time series. Since your original data has equal intervals between data points (15 minutes), it is detected as a time series, which triggers the bug. – Imperishable Night Jun 15 '19 at 23:36
  • I see. The point is, you cannot safely use a matplotlib.dates formatter or locator with standard pandas plots, because they tend to use different units depending on the data. The easiest is to use the `x_compat=True` option when plotting. Similar issues are [this](https://stackoverflow.com/questions/50492185/cannot-set-datetime-ticks-when-using-pandas/50492239#50492239), [this](https://stackoverflow.com/questions/44213781/pandas-dataframe-line-plot-display-date-on-xaxis/44214830#44214830), or [this](https://stackoverflow.com/questions/48790378/how-to-get-ticks-every-hour/48791644#48791644). – ImportanceOfBeingErnest Jun 16 '19 at 00:26
  • Adding the x_compat=True to the plot command resolved this. Thank you! I'd been pulling my hair out trying to fix the code. – NZ Kiwi Jun 16 '19 at 02:02
  • Ideally one would now close this as duplicate of any of the above links, but I cannot cast any vote on this any more. Someone else could. – ImportanceOfBeingErnest Jun 16 '19 at 02:16
  • Does this answer your question? [Cannot set datetime ticks when using pandas](https://stackoverflow.com/questions/50492185/cannot-set-datetime-ticks-when-using-pandas) – Gonçalo Peres May 27 '21 at 15:56
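
The `x_compat=True` suggestion from the comments can be sketched as follows. This is a minimal sketch rather than the asker's exact script: the example rows are typed inline instead of being read from `tempfile.csv`, and the Agg backend and the in-memory buffer are assumptions made so that it runs without a display and without writing files.

```python
import io

import matplotlib
matplotlib.use("Agg")  # assumption: headless backend, no display needed
import matplotlib.dates as mdates
import matplotlib.pyplot as plt
import pandas as pd

# The six example rows, evenly spaced at 15 minutes.
timestamps = [
    "16/06/2019 02:15", "16/06/2019 02:30", "16/06/2019 02:45",
    "16/06/2019 03:00", "16/06/2019 03:15", "16/06/2019 03:30",
]
df = pd.DataFrame({
    "DateTime": pd.to_datetime(timestamps, format="%d/%m/%Y %H:%M"),
    "Air Temperature": [13.7, 13.7, 13.6, 13.4, 13.5, 13.5],
})

# x_compat=True asks pandas not to apply its own time-series tick handling,
# so the x-axis stays in units that matplotlib.dates understands.
ax = df.plot(kind="line", x="DateTime", y="Air Temperature", x_compat=True)
ax.xaxis.set_major_formatter(mdates.DateFormatter("%H:%M"))

# Render to memory; in the original script this would be
# plt.savefig('graph.png').
buf = io.BytesIO()
ax.figure.savefig(buf, format="png")
plt.close(ax.figure)
```

In the original script the only change needed is the extra `x_compat=True` argument in the `df.plot(...)` call; the formatter line can stay as it is.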
