Pandas dataframe.plot mismatch with matplotlib.pyplot

Question

I'm working on a project that retrieves data from a MySQL database and plots it into a PDF. I've done this numerous times before, but I've never run into a problem like this one.

The dataframe.plot() method gives a very strange xlim compared to when I call matplotlib.pyplot.plot(x, y).

Here's my code (simplified):

from datetime import datetime
import matplotlib.dates as mdates
import matplotlib.pyplot as plt

import pandas


# For simplicity's sake, I'll just read the CSV here
data = pandas.read_csv(r"\\earth/various/home/metservices/A03/20190429/stationscheck/1224.csv", index_col=[0, 1], parse_dates=[0])

primary_station = 1224
data_prim = data[primary_station]
data_prim = data_prim.reset_index()
data_prim = data_prim.set_index(["dtg", "id_parameter"]).unstack(level=1)[primary_station].sort_index()

ax1 = plt.subplot(2, 1, 1)
data_prim.plot(ax=ax1, lw=.75)

ax2 = plt.subplot(2, 1, 2)
for column in data_prim.columns:
    ax2.plot(mdates.date2num(data_prim.index.values), data_prim[column], label=column, lw=.75)
ax2.legend()
ax2.xaxis.set_major_formatter(mdates.DateFormatter("%B"))

print(list(map(datetime.fromordinal, map(int, ax1.get_xlim()))))
print(list(map(datetime.fromordinal, map(int, ax2.get_xlim()))))

plt.show()

which results in the following image and output:

[datetime.datetime(1176, 8, 12, 0, 0), datetime.datetime(1185, 8, 25, 0, 0)]
[datetime.datetime(2018, 12, 26, 0, 0), datetime.datetime(2019, 5, 12, 0, 0)]

Here's a snippet of my data via print(data.head().to_string()):

stationID                         1210  1212  1218  1220  1224  1232  1321  1361
dtg                 id_parameter
2019-01-01 06:00:00 404            NaN   NaN  14.0  16.0   4.6  26.2  13.0  13.9
                    405           29.1   NaN  27.8  20.1  38.8  49.9  57.6  32.1
2019-01-01 18:00:00 404            NaN   NaN  30.0  36.9   3.8  27.0  13.9  21.6
2019-01-02 06:00:00 404            NaN   NaN   4.4  13.2   1.5   6.4   3.6   4.8
                    405            2.4   NaN  34.4  50.1   5.3  33.4  17.5  26.4
2019-01-02 18:00:00 404            NaN   NaN   0.6   7.7   2.3   1.3   1.8   3.2
2019-01-03 06:00:00 404            NaN   NaN   2.0   6.5   1.2   2.1   2.6   1.0
                    405            7.8   NaN   2.7  14.2   3.5   3.4   4.4   4.2
2019-01-03 18:00:00 404            NaN   NaN   0.1   1.1   0.5   1.2   0.4   0.2
2019-01-04 06:00:00 404            NaN   NaN   3.8   7.6   8.5  12.5   3.4   1.9
                    405           13.0   NaN   3.9   8.7   9.0  13.7   3.8   2.1
2019-01-04 18:00:00 404            NaN   NaN   6.4  16.6  19.5  16.1   5.0   1.5
2019-01-05 06:00:00 404            NaN   NaN   2.0   2.0  10.0   0.0   5.0   0.0
                    405           34.2   NaN   8.3  18.2  29.3  16.1   9.7   1.5
2019-01-05 18:00:00 404            NaN   NaN   1.0   0.4   6.0   2.0   1.0   0.0
2019-01-06 06:00:00 404            NaN   NaN   0.6   1.5   2.0   1.8   0.3   0.0
                    405            9.5   NaN   1.9   1.8   7.9   3.5   1.6   0.0
2019-01-06 18:00:00 404            NaN   NaN   2.0   7.2   4.4   2.1   1.3   0.0
2019-01-07 06:00:00 404            NaN   NaN   0.6   2.8   1.2   0.3   0.7   0.0
                    405            2.2   NaN   2.6  10.0   5.6   2.4   2.0   0.0
2019-01-07 18:00:00 404            NaN   NaN   0.0   1.0   3.7   1.3   1.7   1.4
2019-01-08 06:00:00 404            NaN   NaN   0.5  10.6   6.6   9.3   0.0   0.8
                    405            8.3   NaN   0.5  11.6  10.3  10.6   1.7   2.2
2019-01-08 18:00:00 404            NaN   NaN   3.2   6.6   3.2   3.6   2.2   0.5
2019-01-09 06:00:00 404            NaN   NaN   4.3   3.0   4.2   2.6   0.9   0.3

As far as I know, the pandas plot methods are just wrappers around matplotlib, so why is there such a difference in the xlim? It's causing problems for my project when I use the image later on.

The reason I'm using plt.subplot(2, 1, 1) instead of something like fig, axes = plt.subplots(2, 1) is that I'm also using a Cartopy GeoAxes, which doesn't play nicely without this construction.

Pandas uses its own units for dates/times. If you want it to use the same units as matplotlib, you can use the `x_compat` argument, `df.plot(..., x_compat=True)`. — ImportanceOfBeingErnest, May 07 '19 at 16:21
@ImportanceOfBeingErnest That is exactly what I'm looking for! Should've scoured the pandas Visualisation docs... Mind posting this as an answer so I can accept it? — Matthijs Kramer, May 08 '19 at 07:07
I added three duplicates which should be better suited than yet another answer with the same content. — ImportanceOfBeingErnest, May 08 '19 at 11:29
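
For anyone landing here, a minimal sketch of the `x_compat=True` suggestion from the comment above. The data is synthetic (a made-up 12-hourly index with columns named "404" and "405" standing in for the question's parameters), and the Agg backend is only selected so the snippet runs without a display. It is an illustration under those assumptions, not the original project code.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend so no window is needed
import matplotlib.dates as mdates
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Synthetic stand-in for the question's data: regular 12-hourly timestamps
index = pd.date_range("2019-01-01 06:00", periods=60, freq=pd.Timedelta(hours=12))
df = pd.DataFrame(
    {"404": np.linspace(0.0, 10.0, 60), "405": np.linspace(5.0, 15.0, 60)},
    index=index,
)

ax1 = plt.subplot(2, 1, 1)
# x_compat=True asks pandas to skip its own time-series units
# and use matplotlib's date units on the x-axis instead
df.plot(ax=ax1, lw=0.75, x_compat=True)

ax2 = plt.subplot(2, 1, 2)
for column in df.columns:
    ax2.plot(df.index, df[column], label=column, lw=0.75)
ax2.legend()

# Both axes now report limits that matplotlib's own date helpers can decode
limits_pandas = [mdates.num2date(value) for value in ax1.get_xlim()]
limits_pyplot = [mdates.num2date(value) for value in ax2.get_xlim()]
print(limits_pandas)
print(limits_pyplot)
```

With `x_compat=True` both sets of limits decode to dates around the plotted data. The exact padding may still differ between the two axes depending on the pandas version, so compare the units rather than expecting identical numbers.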
