1

Let we have following Panel:

companies = ["GOOG", "YHOO", "AMZN", "MSFT", "AAPL"]
p = data.DataReader(name = companies, data_source="google", start = "2013-01-01", end = "2017-02-22")

I want to extract values of "Low", "MSFT" for two dates "2013-01-02" and "2013-01-08". I use several options and some of them work, some not. Here are those methods:

  1. Using .ix[] method

p.ix["Low", [0,4], "MSFT"] and the result is:

Date
2013-01-02    27.15
2013-01-08    26.46
Name: MSFT, dtype: float64

So it works, no problem at all.

  1. Using .iloc[] method

p.iloc[2, [0,4], 3] and it also works.

Date
2013-01-02    27.15
2013-01-08    26.46
Name: MSFT, dtype: float64
  1. Using .ix[] method again but in different way

p.ix["Low", ["2013-01-02", "2013-01-08"], "MSFT"] and it returns weird result as:

Date
2013-01-02   NaN
2013-01-08   NaN
Name: MSFT, dtype: float64
  1. Using .loc[] method

p.loc["Low", ["2013-01-02", "2013-01-08"], "MSFT"] and this time an error raised

KeyError: "None of [['2013-01-02', '2013-01-08']] are in the [index]"

1 and 2 are the ones that work, and it is pretty straightforward. However, I don't understand the reason of getting NaN values in 3rd method and an error in 4th method.

Natig Aliyev
  • 379
  • 6
  • 18

1 Answers1

0

Using the following

In [116]: wp = pd.Panel(np.random.randn(2, 5, 4), items=['Low', 'High'],
.....:               major_axis=pd.date_range('1/1/2000', periods=5),
.....:               minor_axis=['A', 'B', 'C', 'D'])

It's good to remember that: .loc uses the labels in the index .iloc uses integer positioning in the index .ix tries acting like .loc but falls back to .iloc if it fails, so we can focus only on the .ix version source

If I do wp.ix['Low'] I get

               A         B         C         D
2000-01-01 -0.864402  0.559969  1.226582 -1.090447
2000-01-02  0.288341 -0.786711 -0.662960  0.613778
2000-01-03  1.712770  1.393537 -2.230170 -0.082778
2000-01-04 -1.297067  1.076110 -1.384226  1.824781
2000-01-05  1.268253 -2.185574  0.090986  0.464095

Now if you want to access the data for 2000-01-01 through 2000-01-03, you need to use the syntax

wp.loc['Low','2000-01-01':'2000-01-03']

which returns

                A         B         C         D
2000-01-01 -0.864402  0.559969  1.226582 -1.090447
2000-01-02  0.288341 -0.786711 -0.662960  0.613778
2000-01-03  1.712770  1.393537 -2.230170 -0.082778
Community
  • 1
  • 1
Некто
  • 1,730
  • 10
  • 17
  • Thanks for the comment, but panel doesn't have `set_index()` method. – Natig Aliyev Feb 22 '17 at 21:37
  • The logic for why it's failing is the same. You're using the index incorrectly. Perhaps trying using the reindex() method. Panels support that. http://pandas.pydata.org/pandas-docs/stable/generated/pandas.Panel.reindex.html – Некто Feb 22 '17 at 21:39
  • If dates are not index, then why does `p.loc["Low", "2013-01-08", "MSFT"]` returns correct result. Shouldn't it also raise an error? – Natig Aliyev Feb 22 '17 at 21:41
  • You may be accessing another axis. What are your major/minor axes? Try p.axes – Некто Feb 22 '17 at 21:53
  • `Items axis: Open to Volume` `Major_axis axis: 2013-01-02 00:00:00 to 2017-02-21 00:00:00` `Minor_axis axis: AAPL to YHOO` – Natig Aliyev Feb 22 '17 at 21:55
  • I was wrong. Please see the above answer for the solution. It just requires different syntax. – Некто Feb 22 '17 at 22:24
  • Actually, yeah, slicing works, e.g `p.loc["Low", "2013-01-02":"2013-01-08", "MSFT"]` gives no error. However, my question is not about slicing, it is about list of specific dates, not a range. – Natig Aliyev Feb 22 '17 at 22:47
  • Yeah honestly, that's probably a bug. As per the documentation http://pandas.pydata.org/pandas-docs/stable/generated/pandas.Panel.loc.html it is supposed to accept a list of labels – Некто Feb 22 '17 at 22:57