I'm not sure why I am getting a KeyError in the sample code below

Question

The CSV file I am importing has a column named 'MED EXP DATE'. I reference this same column earlier in the code with no problem. When I reference it later, I get a key error.

import pandas as pd
df = pd.read_csv(r"C:\Users\Desktop\Air_Rally_Marketing\PILOT_BASIC.csv", low_memory=False, dtype={'UNIQUE ID': str, 'FIRST NAME': str,'LAST NAME': str, 'STREET 1': str, 'STREET 2': str, 'CITY': str, 'STATE': str, 'ZIP CODE': str, 'MED DATE': str, 'MED EXP DATE': str})
df.dropna(inplace = True)
df['EXP DATE LEN'] = df['MED EXP DATE'].apply(len) #creates a column to store the length of the column MED EXP DATE
print(df.head())  # call head() so the first rows are printed, not the bound method

This is the error I receive:

return self._engine.get_value(s, k, tz=getattr(series.dtype, "tz", None))
  File "pandas\_libs\index.pyx", line 80, in pandas._libs.index.IndexEngine.get_value
  File "pandas\_libs\index.pyx", line 90, in pandas._libs.index.IndexEngine.get_value
  File "pandas\_libs\index.pyx", line 135, in pandas._libs.index.IndexEngine.get_loc
  File "pandas\_libs\index_class_helper.pxi", line 109, in pandas._libs.index.Int64Engine._check_type
KeyError: 'MED EXP DATE'

When I search the meaning of this error, my understanding is that it means I am referencing a key that cannot be found. I'm confused by this because I reference "MED EXP DATE" in a prior line and do not get the key error there.

Are you sure that `df = df[df['STATE'].isin(state)]` does what you want? Also, see https://www.youtube.com/watch?v=NTaNksV-DPY — jarmod, Apr 26 '20 at 23:48
`print(df.columns)` most likely that column doesn't exist, it may have white space you need to strip — Umar.H, Apr 27 '20 at 00:08
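
The check suggested in the comment above could look roughly like this. It is a minimal sketch on a made-up frame; the trailing space in the column name is an assumption for illustration, not something confirmed about the CSV in the question.

```python
import pandas as pd

# Made-up frame; the trailing space in the first label is assumed for illustration.
df = pd.DataFrame({"MED EXP DATE ": ["012021"], "STATE": ["CA"]})

# Inspect the exact labels; list() makes stray whitespace visible in the quotes.
print(list(df.columns))
has_before = "MED EXP DATE" in df.columns  # the padded label does not match

# Strip leading/trailing whitespace from every column label.
df.columns = df.columns.str.strip()
has_after = "MED EXP DATE" in df.columns   # the cleaned label now matches
```

If the printed labels already match exactly, whitespace is not the cause and the problem lies elsewhere in the code.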

score 1 · Accepted Answer · answered Apr 27 '20 at 00:26

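This line:

df = Right = df['MED EXP DATE'].str[-4:]

turns your df variable into a Series instead of a pandas DataFrame. A Series is indexed by row labels rather than column names, so by the time the code reaches the apply statement, df['MED EXP DATE'] is looked up in the row index, where it does not exist, and that raises the KeyError.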

SOLUTION: Use double brackets to ensure df remains a pandas DataFrame.

Setting your df to a Series

df = pd.DataFrame([{'a': 1, 'b': 2}, {'a': 1, 'b': 3}])
df = df['a']
type(df)
<class 'pandas.core.series.Series'>

Retaining your df as a DataFrame

df = pd.DataFrame([{'a': 1, 'b': 2}, {'a': 1, 'b': 3}])
df = df[['a']]
type(df)
<class 'pandas.core.frame.DataFrame'>
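
To tie this back to the question, here is a minimal sketch that reproduces the KeyError on made-up data standing in for the CSV, then shows one possible way around it: storing the last four characters in a new column instead of rebinding df. The column name 'EXP YEAR' and the sample values are hypothetical. Note that the .str accessor is only available on a Series, so double brackets alone would not work for this particular line.

```python
import pandas as pd

# Made-up sample data standing in for the CSV from the question.
df = pd.DataFrame({"MED EXP DATE": ["012021", "062022"], "STATE": ["CA", "NV"]})

# This is what `df` was rebound to in the question: a Series, not a DataFrame.
as_series = df["MED EXP DATE"].str[-4:]
print(type(as_series))

# A Series is keyed by row labels (0, 1, ...), so a column name is not a valid key.
try:
    as_series["MED EXP DATE"]
    raised = False
except KeyError:
    raised = True

# One possible fix: leave `df` intact and store the result in a new column.
# 'EXP YEAR' is a hypothetical column name chosen for this sketch.
df["EXP YEAR"] = df["MED EXP DATE"].str[-4:]
df["EXP DATE LEN"] = df["MED EXP DATE"].apply(len)
print(df)
```

Because df is never overwritten, every later reference to df['MED EXP DATE'] still finds the column.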

THANK YOU! I did an up vote but as a new member it apparently doesn't count. — sfav8r, Apr 27 '20 at 01:25

score 0 · Answer 2 · answered Apr 26 '20 at 23:57

I think the error is in this line: `df = Right = df['MED EXP DATE'].str[-4:]`. Why are there two assignment operators (`=`) here?

answered by PRIN

This is okay if he wants both `df` and `Right` to have the same value as `df['MED EXP DATE'].str[-4:]`. See here https://stackoverflow.com/a/27272057/12684122 – olenscki Apr 26 '20 at 23:59
Yes, you are correct @olenscki. Then something else is there! – PRIN Apr 27 '20 at 00:07
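
As a small illustration of the point made in these comments, the sketch below (on made-up data) shows that a chained assignment is valid Python: the right-hand side is evaluated once and both names are bound to the same object. That also means df stops being a DataFrame at that point.

```python
import pandas as pd

# Made-up data standing in for the column from the question.
df = pd.DataFrame({"MED EXP DATE": ["012021", "062022"]})

# The right-hand side is evaluated once, then bound to `df` and to `Right`.
df = Right = df["MED EXP DATE"].str[-4:]

same_object = df is Right               # both names refer to the same object
is_series = isinstance(df, pd.Series)   # `df` is now a Series, no longer a DataFrame
print(same_object, is_series)
```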
