I have two data frames, tourney_data and season_statistics_T1. I tried to merge them as follows, but the merge raises the KeyError shown below, and I do not understand why.

tourney_data = pd.merge(tourney_data, season_statistics_T1, on = ['Season', 'T1_TeamID'], how = 'left')

The output of tourney_data.head().to_dict() is:

{'Season': {0: 2010, 1: 2010, 2: 2010, 3: 2010, 4: 2010},
 'DayNum': {0: 138, 1: 138, 2: 138, 3: 138, 4: 138},
 'T1_TeamID': {0: 3124, 1: 3173, 2: 3181, 3: 3199, 4: 3207},
 'T1_Score': {0: 69, 1: 67, 2: 72, 3: 75, 4: 62},
 'T2_TeamID': {0: 3201, 1: 3395, 2: 3214, 3: 3256, 4: 3265},
 'T2_Score': {0: 55, 1: 66, 2: 37, 3: 61, 4: 42}}

However, I cannot produce the same output for season_statistics_T1. Calling season_statistics_T1.head() works fine, but season_statistics_T1.head().to_dict() raises KeyError: 'Season'. The traceback from the merge is:


KeyError                                  Traceback (most recent call last)
<ipython-input-76-09e6432ab186> in <module>
----> 1 tourney_data = pd.merge(tourney_data, season_statistics_T1, on = ['Season', 'T1_TeamID'], how = 'left')

~\Anaconda3\envs\vscodeds\lib\site-packages\pandas\core\reshape\merge.py in merge(left, right, how, on, left_on, right_on, left_index, right_index, sort, suffixes, copy, indicator, validate)
     85         copy=copy,
     86         indicator=indicator,
---> 87         validate=validate,
     88     )
     89     return op.get_result()

~\Anaconda3\envs\vscodeds\lib\site-packages\pandas\core\reshape\merge.py in __init__(self, left, right, how, on, left_on, right_on, axis, left_index, right_index, sort, suffixes, copy, indicator, validate)
    650             self.right_join_keys,
    651             self.join_names,
--> 652         ) = self._get_merge_keys()
    653 
    654         # validate the merge keys dtypes. We may need to coerce

~\Anaconda3\envs\vscodeds\lib\site-packages\pandas\core\reshape\merge.py in _get_merge_keys(self)
   1003                     if not is_rkey(rk):
   1004                         if rk is not None:
-> 1005                             right_keys.append(right._get_label_or_level_values(rk))
   1006                         else:
   1007                             # work-around for merge_asof(right_index=True)

~\Anaconda3\envs\vscodeds\lib\site-packages\pandas\core\generic.py in _get_label_or_level_values(self, key, axis)
   1561             values = self.axes[axis].get_level_values(key)._values
   1562         else:
-> 1563             raise KeyError(key)
   1564 
   1565         # Check for duplicates

KeyError: 'Season'


user288609
  • Have you checked your column names to make sure there aren't leading or trailing spaces? It would help us to help you better if you would edit your question to include your sample data as text, using `df.head().to_dict()`, rather than as a picture; see [How to make good pandas example](https://stackoverflow.com/questions/20109391/how-to-make-good-reproducible-pandas-examples). – G. Anderson Mar 17 '21 at 18:21
  • @G.Anderson, I tried `df.head().to_dict()`. One of the data frames raises a KeyError with it, even though `season_statistics_T1.head()` on its own works fine. I have added this to the original post. – user288609 Mar 17 '21 at 19:38
  • That's a weird place to get a `KeyError`, what about the output of `season_statistics_T1.columns` ? – G. Anderson Mar 17 '21 at 20:58
  • @G.Anderson, I found that the error comes from season_statistics_T1. To make the problem statement clearer, I created another post listing all the related information. Thank you! – user288609 Mar 18 '21 at 02:35
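
The two checks suggested in the comments (inspecting the column labels and looking for stray whitespace) could be sketched roughly as follows. The real season_statistics_T1 is not shown in the question, so the small frames here are hypothetical stand-ins, the `T1_PointsMean` column is invented, and the trailing space in `"Season "` is an assumed defect used only to reproduce a similar KeyError; the actual cause may be different.

```python
import pandas as pd

# Hypothetical stand-ins: the real frames are not available, and the
# trailing space in "Season " is an assumption made for illustration.
tourney_data = pd.DataFrame(
    {"Season": [2010, 2010], "T1_TeamID": [3124, 3173], "T1_Score": [69, 67]}
)
season_statistics_T1 = pd.DataFrame(
    {"Season ": [2010, 2010], "T1_TeamID": [3124, 3173], "T1_PointsMean": [70.1, 65.4]}
)

keys = ["Season", "T1_TeamID"]

# repr() makes leading/trailing whitespace visible, which head() does not.
print([repr(c) for c in season_statistics_T1.columns])

# List the merge keys that are not present as column labels on the right frame.
missing = [k for k in keys if k not in season_statistics_T1.columns]
print("missing keys:", missing)

# Strip whitespace from string labels, leaving non-string labels untouched.
cleaned = season_statistics_T1.rename(
    columns=lambda c: c.strip() if isinstance(c, str) else c
)

# With the labels cleaned, the original left merge goes through.
merged = pd.merge(tourney_data, cleaned, on=keys, how="left")
print(merged)
```

If `missing` comes back empty and the labels look clean, whitespace is not the problem and the column index itself deserves a closer look, for example by posting the output of `season_statistics_T1.columns` as requested above.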
