
I am attempting to slice a pandas DataFrame by column labels using `.loc`. Based on the pandas documentation, https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html, `.loc` seems like the right indexer for this use case.

The original pandas DataFrame, with confirmation that the columns with these labels exist:

[Screenshot of the DataFrame and its column labels; image not available]

The column labels are dynamically constructed and passed as a list to slice the DataFrame.

# Create dictionaries 
prop_dict = dict(zip(df_list.id, df_list.Company))
city_dict = dict(zip(df_list.id, df_list.city))   

# Lookup keys (property ids) from prop_dict
propKeys = getKeysByValue(prop_dict, landlord)
cityKeys = getKeysByValue(city_dict, market)

prop_list = list(set(propKeys) & set(cityKeys))
print(prop_list)

[19, 27]

# Slice dataframe 
df_temp = df_t.loc[:, prop_list]

However, this throws an error: `KeyError: 'None of [[19, 27]] are in the [columns]'`

Full traceback here:

Traceback (most recent call last):
  File "/Platform/Deploy/tabs/market.py", line 279, in render_table
    result = top_leads(company, market)
  File "/Platform/Deploy/return_leads.py", line 86, in top_leads
    df_temp = df_matrix.loc[:, prop_list]
  File "/anaconda3/lib/python3.7/site-packages/pandas/core/indexing.py", line 1472, in __getitem__
    return self._getitem_tuple(key)
  File "/anaconda3/lib/python3.7/site-packages/pandas/core/indexing.py", line 890, in _getitem_tuple
    retval = getattr(retval, self.name)._getitem_axis(key, axis=i)
  File "/anaconda3/lib/python3.7/site-packages/pandas/core/indexing.py", line 1901, in _getitem_axis
    return self._getitem_iterable(key, axis=axis)
  File "/anaconda3/lib/python3.7/site-packages/pandas/core/indexing.py", line 1143, in _getitem_iterable
    self._validate_read_indexer(key, indexer, axis)
  File "/anaconda3/lib/python3.7/site-packages/pandas/core/indexing.py", line 1206, in _validate_read_indexer
    key=key, axis=self.obj._get_axis_name(axis)))
KeyError: 'None of [[19, 27]] are in the [columns]'
  • You haven't added `proplist` as a column. Read literally, `df.loc[:,proplist]` means all rows in the column `proplist`. Your `proplist` is a list, not a column in the DataFrame, as far as I can see – wwnde Apr 30 '20 at 06:54
  • `proplist` is a list of columns passed to slice the dataframe. List elements 19 and 27 are column labels. – kms Apr 30 '20 at 07:02
  • Please provide the code and data (the DataFrame) in a reusable format. Do NOT use pictures for them. Thanks. – sentence Apr 30 '20 at 09:18

1 Answer


Is it possible that the columns '19' and '27' also happen to sit at the 19th and 27th positions, and that this is why your first attempt returned the expected result, because the integers 19 and 27 were taken as positions? If you want to pass the names as a list, they need quotes around them: it should be ['19', '27'] instead of [19, 27].
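
To check whether this is what is happening, compare the type of the column labels with the type of the keys you pass to `.loc`. Below is a minimal sketch, assuming the column labels are stored as strings; the small `df_t` here is a made-up stand-in, not the DataFrame from the question.

```python
import pandas as pd

# Hypothetical stand-in for the asker's df_t: the column labels are strings.
df_t = pd.DataFrame([[1, 2, 3], [4, 5, 6]], columns=["19", "27", "33"])

prop_list = [19, 27]  # integers, as printed in the question

# Collect the Python type names of the labels and of the lookup keys.
column_types = {type(c).__name__ for c in df_t.columns}
key_types = {type(k).__name__ for k in prop_list}
print("column label types:", column_types)
print("lookup key types:", key_types)

# .loc matches labels by value and type, so integer keys miss string labels.
try:
    df_t.loc[:, prop_list]
    raised = False
except KeyError as exc:
    raised = True
    print("KeyError:", exc)
```

If the two sets of type names differ, `.loc` cannot find the labels even though they look identical when printed.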

Dnorious
  • 19 and 27 are random ids and mapped to company. So, they're not positional indexers. 19 and 27 are column names as returned when running: `df_t.columns`. – kms Apr 30 '20 at 19:36
  • The green colour in your pictures indicates that 19 and 27 are being used as integers, and thus probably as positions. If you want to refer to columns by name without going through `df.columns`, you should wrap the names in quotes, e.g. '19'. See this link if it is not clear: https://stackoverflow.com/questions/46307490/how-can-i-extract-the-nth-row-of-a-pandas-data-frame-as-a-pandas-data-frame – Dnorious May 01 '20 at 08:50
  • This was indeed the issue. The list elements were integers and I had to change them to string type: `prop_list1 = [str(i) for i in prop_list]`, then `df_temp = df_matrix.loc[:, prop_list1]` – kms May 01 '20 at 18:29
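
For reference, the fix from the last comment can be sketched as below, together with one possible alternative. The sample `df_matrix` is made up for illustration, and the second option assumes every column label is a numeric string, which may not hold for the real data.

```python
import pandas as pd

# Hypothetical stand-in for df_matrix: string column labels.
df_matrix = pd.DataFrame([[1, 2, 3], [4, 5, 6]], columns=["19", "27", "33"])
prop_list = [19, 27]  # integer ids, as in the question

# Option 1 (the fix from the comment): cast the keys to strings.
prop_list1 = [str(i) for i in prop_list]
df_temp = df_matrix.loc[:, prop_list1]

# Option 2 (an alternative, assuming all labels are numeric strings):
# cast the column labels to integers once, then use the integer keys directly.
df_int = df_matrix.copy()
df_int.columns = df_int.columns.astype(int)
df_temp_int = df_int.loc[:, prop_list]

print(df_temp)
print(df_temp_int)
```

Either way, the keys and the column labels end up with the same type, which is what `.loc` needs for a label match.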