2

I am getting KeyError: 'stack_over_flow' when I run my code.

e0 = pd.read_csv(working_dir+"E0.txt",sep=',')
e0['MTM'] = e0['stack_over_flow']

I printed the columns of e0, and stack_over_flow does appear among them:

b'Super_user'
b'Personal_finance'
b'stack_over_flow'

I also tried removing the b manually from the .txt file and still get the same error. Can anyone help with this?

traceback:

Traceback (most recent call last):

  File "<ipython-input-74-99e71d524b4b>", line 1, in <module>
    runfile('C:/AppData/FinRecon/py_code/python3/DataJoin.py', wdir='C:/AppData/FinRecon/py_code/python3')

  File "C:\Users\stack\AppData\Local\Continuum\anaconda3\anaconda3_32bit\lib\site-packages\spyder_kernels\customize\spydercustomize.py", line 827, in runfile
    execfile(filename, namespace)

  File "C:\Users\stack\AppData\Local\Continuum\anaconda3\anaconda3_32bit\lib\site-packages\spyder_kernels\customize\spydercustomize.py", line 110, in execfile
    exec(compile(f.read(), filename, 'exec'), namespace)

  File "C:/AppData/FinRecon/py_code/python3/DataJoin.py", line 474, in <module>
    M2()

  File "C:/AppData/FinRecon/py_code/python3/DataJoin.py", line 41, in M2
    e0['MTM'] = e0['stack_over_flow']

  File "C:\Users\stack\AppData\Local\Continuum\anaconda3\anaconda3_32bit\lib\site-packages\pandas\core\frame.py", line 2927, in __getitem__
    indexer = self.columns.get_loc(key)

  File "C:\Users\stack\AppData\Local\Continuum\anaconda3\anaconda3_32bit\lib\site-packages\pandas\core\indexes\base.py", line 2659, in get_loc
    return self._engine.get_loc(self._maybe_cast_indexer(key))

  File "pandas/_libs/index.pyx", line 108, in pandas._libs.index.IndexEngine.get_loc

  File "pandas/_libs/index.pyx", line 132, in pandas._libs.index.IndexEngine.get_loc

  File "pandas/_libs/hashtable_class_helper.pxi", line 1601, in pandas._libs.hashtable.PyObjectHashTable.get_item

  File "pandas/_libs/hashtable_class_helper.pxi", line 1608, in pandas._libs.hashtable.PyObjectHashTable.get_item

KeyError: 'stack_over_flow'

Update: I figured it out. The cause is the b and the quotes ('') wrapped around each header. Why do they get added to my .txt file?
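One possible explanation, which the question does not confirm, is that the file was written by Python 3 code that converted bytes objects to text with str(), which produces the literal characters b'...'. The sketch below uses made-up inline data in place of E0.txt and a hypothetical helper, clean_header, to strip that wrapper from the column labels after loading.

```python
import io
import re

import pandas as pd

# In Python 3, str() on a bytes object yields the literal text b'...'.
print(str(b"stack_over_flow"))

# Hypothetical stand-in for E0.txt: the header cells contain the b'...' text.
raw = "b'Super_user',b'Personal_finance',b'stack_over_flow'\n1,2,3\n"
e0 = pd.read_csv(io.StringIO(raw), sep=",")
print(list(e0.columns))  # labels still carry the b'...' wrapper


def clean_header(name):
    """Strip a b'...' or b"..." wrapper from a column label, if present."""
    match = re.fullmatch(r"\s*b(['\"])(.*)\1\s*", name)
    return match.group(2) if match else name.strip()


e0.columns = [clean_header(c) for c in e0.columns]
e0["MTM"] = e0["stack_over_flow"]  # the lookup now succeeds
```

Fixing whatever writes the file (decoding the bytes before writing them out) would be the cleaner long-term solution. Renaming after the load only works around it.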

excelguy
  • [Provide a copy of the data](https://stackoverflow.com/questions/52413246/how-do-i-provide-a-reproducible-copy-of-my-existing-dataframe) or the csv file. – Trenton McKinney Aug 19 '19 at 16:53

3 Answers

1

I changed your data slightly. Say E0.txt contains the following:

stackoverflow,"some column name", test
1, 2, 3

Use the code below to retrieve the contents of a column.

import pandas as pd
e0 = pd.read_csv(working_dir+"E0.txt",sep=',')
# the key must match the header in the file exactly
e0['MTM'] = e0['stackoverflow']


-- update --

Without the b prefixes, I created a test sample, and it works for the input below:

Super_user,Personal_finance,stack_over_flow
1, 2, 3
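As a side note on sample files like the ones above: a space after a comma in the header row becomes part of the column label, which is another way to get a KeyError on a name that looks correct. A minimal sketch, using inline text in place of E0.txt, shows the difference that read_csv's skipinitialspace option makes.

```python
import io

import pandas as pd

# Inline stand-in for the sample file; note the space before "test".
raw = 'stackoverflow,"some column name", test\n1, 2, 3\n'

default = pd.read_csv(io.StringIO(raw), sep=",")
print([repr(c) for c in default.columns])  # repr makes stray spaces visible

# skipinitialspace=True drops whitespace that follows each delimiter.
stripped = pd.read_csv(io.StringIO(raw), sep=",", skipinitialspace=True)
stripped["MTM"] = stripped["test"]
```

Printing repr() of each label is a quick way to spot hidden characters before indexing.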
ajayramesh
0

Assuming the pandas DataFrame is already loaded, select the column like this:

df[['stack_over_flow']]
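For reference, here is a small sketch of what the two bracket forms return, using a made-up one-row frame. Both forms look the label up in df.columns, so either one still raises KeyError if the stored label differs from the key.

```python
import pandas as pd

# Made-up frame with the column names from the question.
df = pd.DataFrame(
    {"Super_user": [1], "Personal_finance": [2], "stack_over_flow": [3]}
)

as_frame = df[["stack_over_flow"]]  # list of labels -> DataFrame
as_series = df["stack_over_flow"]   # single label -> Series

print(type(as_frame).__name__, type(as_series).__name__)
```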
0

I suspect the problem lies in the encoding when reading the csv. Try adding encoding='utf-8' inside your read_csv call.

https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.read_csv.html
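A quick way to test this suggestion is sketched below, with made-up bytes standing in for the file. It assumes the header row literally contains the characters b'...', as the question's update describes. Those characters are plain ASCII, so decoding as UTF-8 leaves them in the labels.

```python
import io

import pandas as pd

# Hypothetical file bytes whose header row contains the literal b'...' text.
raw_bytes = "b'Super_user',b'stack_over_flow'\n1,3\n".encode("utf-8")

e0 = pd.read_csv(io.BytesIO(raw_bytes), sep=",", encoding="utf-8")
print(list(e0.columns))  # the b'...' wrapper is still part of each label
```

If that assumption holds, the encoding argument alone would not be expected to fix the KeyError; the labels would still need to be cleaned or the file rewritten.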

vtnate