
Problem: I am trying to filter one dataframe using a mask that compares its index to the index of another dataframe. I've created a boolean mask, but every attempt returns an empty dataframe.

Question: Is there a way to mask using indexes, or perhaps by converting the indexes to columns and comparing the dataframes?

The mask looks like:

my_towns.set_index(['State', 'RegionName'], inplace=True)
index1 = my_towns.index
index2 = qtr_data.index
qtr_data[qtr_data.index.isin(my_towns.index)]

The result is an empty dataframe.

Test datasets: my_towns and qtr_data. Both dataframes have a MultiIndex. A simple example can be built from the code below by following the masking steps above, i.e., using df2.index as a filter on df1's rows.

df1 = pd.DataFrame({'State' : ['Hawaii', 'Alabama', 'California', 'Washington'], 'Region' : ['Honolulu', 'Mobile', 'Los Angeles', 'Spokane'], '2001q1' : ['123','345','456','567']}).set_index(['State','Region'])

df2 = pd.DataFrame({'State' : ['Alabama', 'California', 'Washington'], 'Region' : ['Mobile', 'Los Angeles', 'Spokane']}).set_index(['State', 'Region'])
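For reference, here is a minimal sketch of the masking steps applied to the two example frames above. The names `mask` and `filtered` are only illustrative, and this says nothing about why the real datasets behave differently.

```python
import pandas as pd

# Example frames from the question, each indexed by (State, Region).
df1 = pd.DataFrame({
    'State': ['Hawaii', 'Alabama', 'California', 'Washington'],
    'Region': ['Honolulu', 'Mobile', 'Los Angeles', 'Spokane'],
    '2001q1': ['123', '345', '456', '567'],
}).set_index(['State', 'Region'])

df2 = pd.DataFrame({
    'State': ['Alabama', 'California', 'Washington'],
    'Region': ['Mobile', 'Los Angeles', 'Spokane'],
}).set_index(['State', 'Region'])

# Boolean mask: True where df1's (State, Region) pair also appears in df2's index.
mask = df1.index.isin(df2.index)

# Keep only the rows of df1 whose index entry is present in df2.
filtered = df1[mask]
```

On this toy data the mask keeps the three rows shared with df2 and drops the Hawaii row, so an empty result on the real data suggests the two indexes do not contain exactly equal tuples.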
Adestin
  • Please don't post images of text. And read http://stackoverflow.com/questions/20109391/how-to-make-good-reproducible-pandas-examples – John Zwinck Jan 28 '17 at 00:27
  • Thanks @JohnZwinck. Example datasets forthcoming. – Adestin Jan 28 '17 at 00:54
  • This seems to work fine with your example data? `df1[df1.index.isin(df2.index)]` produces the expected result for me. – root Jan 28 '17 at 03:20
  • Thanks for testing on the simple example I published, @root. However, if I try that exact recipe on the actual test datasets, I get an empty dataframe back, which shouldn't be the case. Any ideas? – Adestin Jan 28 '17 at 03:49
  • You can also try `df1.loc[df1.index.intersection(df2.index)]`. But it looks to me like that intersection is empty, which is why you are getting an empty dataset – MaxU - stand with Ukraine Jan 28 '17 at 08:47
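One way to follow up on the empty-intersection idea, and on the question's suggestion of converting the indexes to columns, is sketched below. The data is made up: it assumes the keys differ only by stray whitespace, which is just one possible cause and is not confirmed for the real my_towns and qtr_data. The helper `normalized` is hypothetical.

```python
import pandas as pd

# Hypothetical stand-ins for the real datasets (ASSUMPTION: the State keys in
# my_towns carry trailing whitespace, so no index tuple matches exactly).
qtr_data = pd.DataFrame({
    'State': ['Alabama', 'California', 'Hawaii'],
    'RegionName': ['Mobile', 'Los Angeles', 'Honolulu'],
    '2001q1': [345, 456, 123],
}).set_index(['State', 'RegionName'])

my_towns = pd.DataFrame({
    'State': ['Alabama ', 'California '],
    'RegionName': ['Mobile', 'Los Angeles'],
}).set_index(['State', 'RegionName'])

# Step 1: confirm that the two indexes share no tuples at all.
common = qtr_data.index.intersection(my_towns.index)

# Step 2: compare level names, since mismatched names or values are easy to miss.
names_match = list(qtr_data.index.names) == list(my_towns.index.names)

def normalized(df):
    """Hypothetical helper: move the index to columns and strip whitespace from the keys."""
    out = df.reset_index()
    for col in ['State', 'RegionName']:
        out[col] = out[col].str.strip()
    return out

# Step 3: compare as columns with an inner merge on the cleaned keys.
keys = normalized(my_towns)[['State', 'RegionName']].drop_duplicates()
matched = (
    normalized(qtr_data)
    .merge(keys, on=['State', 'RegionName'], how='inner')
    .set_index(['State', 'RegionName'])
)
```

If `common` is empty while `matched` is not, the keys were unequal only in formatting. Other causes worth checking in the same way include differing capitalisation, abbreviated versus full state names, or different dtypes in an index level.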
