tough one for me. I have data in ncDatad that has columns: SITE, data1, year, month, week and a second dataframe dff with columns: week, SITE, LAL. I need to find all rows by matching 'SITE' and 'week' number and assign the LAL data from dff to a new column in ncDatad. The column 'SITE' in dff has a suffix like '_1', '_2', etc. This is the following attempt, but i get a KeyError: 'SITE'.
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File ~\Anaconda3\envs\Stats\lib\site-packages\spyder_kernels\py3compat.py:356 in compat_exec
exec(code, globals, locals)
File \\porfiler03.ar.local\gtdshare\goals_2023\maint_windows\raw_maint_between_risk_3vars_working.py:197
merged_df[lal_col] = merged_df.apply(
File ~\Anaconda3\envs\Stats\lib\site-packages\pandas\core\frame.py:9565 in apply
return op.apply().__finalize__(self, method="apply")
File ~\Anaconda3\envs\Stats\lib\site-packages\pandas\core\apply.py:746 in apply
return self.apply_standard()
File ~\Anaconda3\envs\Stats\lib\site-packages\pandas\core\apply.py:873 in apply_standard
results, res_index = self.apply_series_generator()
File ~\Anaconda3\envs\Stats\lib\site-packages\pandas\core\apply.py:889 in apply_series_generator
results[i] = self.f(v)
File \\porfiler03.ar.local\gtdshare\goals_2023\maint_windows\raw_maint_between_risk_3vars_working.py:199 in <lambda>
if row['SITE'] in dff[lookup_col].values else None,
File ~\Anaconda3\envs\Stats\lib\site-packages\pandas\core\series.py:981 in __getitem__
return self._get_value(key)
File ~\Anaconda3\envs\Stats\lib\site-packages\pandas\core\series.py:1089 in _get_value
loc = self.index.get_loc(label)
File ~\Anaconda3\envs\Stats\lib\site-packages\pandas\core\indexes\base.py:3805 in get_loc
raise KeyError(key) from err
KeyError: 'SITE'
My code attempt:
suffixes = [col.split('_')[1] for col in dff.columns if col.startswith('SITE_')]
merged_df = ncDatad.copy()
for suffix in suffixes:
lookup_col = f'SITE_{suffix}'
lal_col = f'LAL'
merged_df[lal_col] = merged_df.apply(
lambda row: dff.loc[(dff['week'] == row['week']) & (dff[lookup_col]== row['SITE']),'LAL'].iloc[0]
if row['SITE'] in dff[lookup_col].values else None,
axis = 1)
My data in dff looks like this:
dff.head()
Out[24]:
week SITE_1 LAL SITE_2 LAL
0 1 BARTON CHAPEL 1.1 PENASCAL I 1.0
1 2 BARTON CHAPEL 1.1 PENASCAL I 1.0
2 3 BARTON CHAPEL 1.1 PENASCAL I 1.0
3 4 BARTON CHAPEL 1.1 PENASCAL I 1.0
4 5 BARTON CHAPEL 1.1 PENASCAL I 1.0
THANK YOU for any help here.