5

I'm trying to plot time series data in which certain periods have no data. The data is loaded into a DataFrame and I'm plotting it with df.plot(). The problem is that the missing periods are bridged by a straight line when plotted, giving the impression that values exist in those periods when they don't.

Here's an example of the problem

[Image: example plot showing the problem]

There is no data between Sep 01 and Sep 08 as well as between Sep 09 and Sep 25, but the data is plotted in a way that it seems that there are values in that period.

I would like those periods to show either zero values or no values at all. How can I do that?

Just to be clear, I don't have NaN values for periods [Sep 01, Sep 08], [Sep 09, Sep 29], but no data at all (not even in the time index).

Kobe-Wan Kenobi

2 Answers

4

Consider the pd.Series s

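import numpy as np
import pandas as pd

# Ten daily values; the values 3 and 6 are replaced with NaN
s = pd.Series(
    np.arange(10), pd.date_range('2016-03-31', periods=10)
).replace({3: np.nan, 6: np.nan})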

s.plot()


You can see that the np.nan values were skipped, leaving gaps in the line.
However:

s.fillna(0).plot()


0s are not skipped.

So if your gaps are stored as zeros, I suggest replacing them with NaN before plotting: s.replace(0, np.nan).plot()
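The same contrast applies when the dates are absent from the index altogether, as in the question. Here is a minimal sketch, assuming daily data; the sample dates and values are made up for illustration. `asfreq` puts the series on a regular daily grid, and its `fill_value` argument decides whether the new dates get NaN (a gap in the line) or 0 (a line drawn at zero).

```python
import pandas as pd

# Hypothetical series: only four dates exist, the rest are missing from the index.
idx = pd.to_datetime(["2016-09-01", "2016-09-08", "2016-09-09", "2016-09-25"])
s = pd.Series([1.0, 2.0, 3.0, 4.0], index=idx)

# Conform to a daily grid; dates that were not in the index become NaN.
gapped = s.asfreq("D")

# Same grid, but the newly created dates are filled with 0 instead of NaN.
zeroed = s.asfreq("D", fill_value=0)
```

Calling `gapped.plot()` should then leave the empty periods blank, while `zeroed.plot()` draws them at zero.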

piRSquared
3

You should add the missing dates to your dataframe, with NaN values. Then, when plotted, those NaNs break the line -- you will get several line segments, with empty periods between them.

This answer explains best how to add the missing dates to your dataframe. To summarize it, this should do the trick:

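# Assumes daily data; reindexing on the existing index alone would add no rows
df.index = pd.DatetimeIndex(df.index)
df = df.reindex(pd.date_range(df.index.min(), df.index.max(), freq='D'))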
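A minimal sketch of that reindexing step, assuming daily data and a made-up three-row frame (the column name `value` is hypothetical). The target index has to contain the missing dates, so it is built with `pd.date_range` rather than from the existing index; `reindex` then fills the new rows with NaN by default.

```python
import pandas as pd

# Hypothetical frame whose index holds date strings and skips two days.
df = pd.DataFrame(
    {"value": [10, 12, 11]},
    index=["2016-09-01", "2016-09-02", "2016-09-05"],
)

# Convert the string labels to a proper DatetimeIndex.
df.index = pd.DatetimeIndex(df.index)

# Build the complete daily range between the first and last observation.
full_range = pd.date_range(df.index.min(), df.index.max(), freq="D")

# Rows for dates that were absent are inserted and filled with NaN.
df = df.reindex(full_range)
```

After this, `df.plot()` should draw separate line segments with a gap over the inserted dates.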
shx2
  • It is better to leave out `fill_value`: `reindex` inserts `NaN` by default. If you want to specify it explicitly, pass `fill_value=np.nan`. – jezrael Jan 27 '17 at 09:55