pandas resample changes numerical value of index

Question

Background: I'm trying to plot dissimilar pd.Series on the same axes. One of the Series has a much higher data rate, so I want to filter it (series.resample) and reduce the noise. The problem is, after resampling, it no longer plots on top of the low-data-rate Series.

EDIT: In addition, the data have different y-axis scales, so I'm using secondary_y=True. This somehow seems to be important, but I'm not sure why.

Proximate cause: I realized that the automatic x-axis limits coming back from matplotlib are very different after resampling. This means that the underlying numerical value of the index is changing. But I can't find anything in the pandas documentation about this.

EDIT: The xlim() output in the following code snippet demonstrates the change of x-axis limits. @masasa below points out that by issuing the plot commands together, both ds and ds_filt will plot successfully on the same axes. This is true even with secondary_y=True. However, my other Series does not plot successfully with ds_filt (not shown here because I don't even know how to reproduce the failure).

import matplotlib
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

idx = pd.date_range('20190101', '20190103', freq='10s')
arr = np.random.randn(idx.size).cumsum()
ds = pd.Series(index=idx, data=arr)

ds.plot()
plt.xlim()

>>> (1546300800.0, 1546473600.0)

ds_filt = ds.resample('12H').mean()
ds_filt.plot()
plt.xlim()

>>> (429528.0, 429576.0)

Solved using the solution [here](https://stackoverflow.com/questions/29685887/secondary-y-true-changes-x-axis-in-pandas). Root cause is still not clear. — Ilya, Jun 04 '19 at 17:11
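One workaround that may help here is the `x_compat=True` plotting option in pandas, which makes it plot dates in matplotlib's own date units instead of a frequency-dependent integer axis. The sketch below is an assumption, not necessarily what the linked answer recommends. It reuses the question's data, passes an explicit `ax`, and uses the lowercase `'12h'` alias. The `Agg` backend is chosen only so the snippet runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Same data as in the question: 10-second samples over two days.
idx = pd.date_range("20190101", "20190103", freq="10s")
ds = pd.Series(np.random.randn(idx.size).cumsum(), index=idx)
ds_filt = ds.resample("12h").mean()

fig, ax = plt.subplots()

# x_compat=True asks pandas to skip its frequency-based x-axis handling,
# so both series are drawn in the same matplotlib date units (days).
ds.plot(ax=ax, x_compat=True)
ds_filt.plot(ax=ax, x_compat=True, secondary_y=True)

left, right = ax.get_xlim()
xlim_width = right - left  # roughly 2 (days), plus any autoscale margin
print(left, right, xlim_width)
```

If both series share one scale, the reported x-range spans about two days, whichever series was plotted last.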

score 1 · Answer 1 · answered Jun 01 '19 at 09:42


I might not have understood your problem correctly, but your two plots are not on the same figure, which is why they are separated. If you do the following:

ds.plot()
ds.resample('12H').mean().plot()
plt.xlim()

you'll get a merged graph.


masasa


You haven't explained the xlim() output in my original code snippet. But your result is interesting in itself. Let me go back and edit my post to clarify. – Ilya Jun 03 '19 at 16:02
What you saw is the time range in Unix time (seconds since 1970). When you resampled the data, the values were divided by 3600, so they now count hours since 1970 (1546300800 / 3600 = 429528); that's why your xlim changed – masasa Jun 03 '19 at 16:08
Thanks, but why does it do that? And why does ```secondary_y``` have anything to do with it? – Ilya Jun 03 '19 at 16:24
Unix time is the standard time representation on computers (I don't know the reason; Google might). Systems such as Linux and Oracle work in Unix time, and pandas itself uses it for plots, since the x-axis values must be numbers (though you can use matplotlib features to convert dates to numbers yourself) – masasa Jun 03 '19 at 20:47
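The two xlim() outputs in the question can be checked with plain arithmetic. The sketch below uses only the standard library and shows that the first pair matches seconds since 1970-01-01 and the second pair matches hours since 1970-01-01 for the same two dates. Reading the second pair as hours is an inference from the numbers; nothing in this thread confirms it.

```python
from datetime import datetime, timezone

epoch = datetime(1970, 1, 1, tzinfo=timezone.utc)
start = datetime(2019, 1, 1, tzinfo=timezone.utc)  # first sample in the question
end = datetime(2019, 1, 3, tzinfo=timezone.utc)    # last sample in the question

# Seconds since the epoch: matches xlim() for the 10-second series.
seconds_range = ((start - epoch).total_seconds(), (end - epoch).total_seconds())
print(seconds_range)  # (1546300800.0, 1546473600.0)

# Hours since the epoch: matches xlim() for the 12-hour resampled series.
hours_range = (seconds_range[0] / 3600, seconds_range[1] / 3600)
print(hours_range)  # (429528.0, 429576.0)
```

The ratio between the two scales is 3600, not 12 * 3600, which suggests the x-axis unit follows the base unit of the series frequency (seconds or hours) rather than the full 12-hour bin width.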
