0

What's the best way to select specific individual rows from a timeseries?

I've got some data:

indexed = df.set_index("Month").resample("Q-MAR", how="sum")['count']
indexed

Month
2010-12-31     6942
2011-03-31    23677
2011-06-30    24131
2011-09-30    23144
2011-12-31    22249
2012-03-31    24216
2012-06-30    22938
2012-09-30    25468
2012-12-31    21733
2013-03-31    21385
2013-06-30    23093
2013-09-30    26206
2013-12-31    22248
2014-03-31    20737
2014-06-30    23384
2014-09-30    25285
2014-12-31    22210
2015-03-31    22627
2015-06-30    25185
2015-09-30    27038
2015-12-31    25352
2016-03-31    16694
Freq: Q-MAR, Name: count, dtype: int64

I can slice out individual entries, like this:

indexed.ix["2012-09-30"]
25468

indexed.ix["2015-09-30"]
27038

But how do I select both?

I tried:

indexed.ix[["2012-09-30", "2015-09-30"],:]

Which returns a too many indexers error...

While

indexed.ix[["2012-09-30", "2015-09-30"]]

brings back

Month
2012-09-30   NaN
2015-09-30   NaN

And I don't understand why

indexed.ix["2012-09-30", "2015-09-30"]

returns

25468

Could someone please explain it to me, please?

elksie5000
  • 7,084
  • 12
  • 57
  • 87

2 Answers2

1

access by index works as expected if you specify correct dtypes:

In [13]: df.loc[[pd.to_datetime('2012-09-30'),pd.to_datetime('2015-09-30')]]
Out[13]:
            Count
Month
2012-09-30  25468
2015-09-30  27038

In [14]: df.loc[['2012-09-30','2015-09-30']]
Out[14]:
            Count
Month
2012-09-30    NaN
2015-09-30    NaN

UPDATE:

starting from Pandas 0.20.1 the .ix indexer is deprecated, in favor of the more strict .iloc and .loc indexers.

MaxU - stand with Ukraine
  • 205,989
  • 36
  • 386
  • 419
0

I found this answer useful on indexing: pandas, python - how to select specific times in timeseries

indexed[(pd.Index(indexed.index.quarter).isin([3])) & (pd.Index(indexed.index.year).isin([2012,2015]))]

Month
2012-09-30    25468
2015-09-30    27038
Community
  • 1
  • 1
elksie5000
  • 7,084
  • 12
  • 57
  • 87