Lookup Data in a DataFrame and assign to 2nd DataFrame when Condition Met

Question

tough one for me. I have data in ncDatad that has columns: SITE, data1, year, month, week and a second dataframe dff with columns: week, SITE, LAL. I need to find all rows by matching 'SITE' and 'week' number and assign the LAL data from dff to a new column in ncDatad. The column 'SITE' in dff has a suffix like '_1', '_2', etc. This is the following attempt, but i get a KeyError: 'SITE'.

The above exception was the direct cause of the following exception:

Traceback (most recent call last):

  File ~\Anaconda3\envs\Stats\lib\site-packages\spyder_kernels\py3compat.py:356 in compat_exec
    exec(code, globals, locals)

  File \\porfiler03.ar.local\gtdshare\goals_2023\maint_windows\raw_maint_between_risk_3vars_working.py:197
    merged_df[lal_col] = merged_df.apply(

  File ~\Anaconda3\envs\Stats\lib\site-packages\pandas\core\frame.py:9565 in apply
    return op.apply().__finalize__(self, method="apply")

  File ~\Anaconda3\envs\Stats\lib\site-packages\pandas\core\apply.py:746 in apply
    return self.apply_standard()

  File ~\Anaconda3\envs\Stats\lib\site-packages\pandas\core\apply.py:873 in apply_standard
    results, res_index = self.apply_series_generator()

  File ~\Anaconda3\envs\Stats\lib\site-packages\pandas\core\apply.py:889 in apply_series_generator
    results[i] = self.f(v)

  File \\porfiler03.ar.local\gtdshare\goals_2023\maint_windows\raw_maint_between_risk_3vars_working.py:199 in <lambda>
    if row['SITE'] in dff[lookup_col].values else None,

  File ~\Anaconda3\envs\Stats\lib\site-packages\pandas\core\series.py:981 in __getitem__
    return self._get_value(key)

  File ~\Anaconda3\envs\Stats\lib\site-packages\pandas\core\series.py:1089 in _get_value
    loc = self.index.get_loc(label)

  File ~\Anaconda3\envs\Stats\lib\site-packages\pandas\core\indexes\base.py:3805 in get_loc
    raise KeyError(key) from err

KeyError: 'SITE'

My code attempt:

suffixes = [col.split('_')[1] for col in dff.columns if col.startswith('SITE_')]
merged_df = ncDatad.copy()
for suffix in suffixes:
    lookup_col = f'SITE_{suffix}'
    lal_col = f'LAL'
    merged_df[lal_col] = merged_df.apply(
    lambda row: dff.loc[(dff['week'] == row['week']) & (dff[lookup_col]== row['SITE']),'LAL'].iloc[0]
    if row['SITE'] in dff[lookup_col].values else None,
    axis = 1)

My data in dff looks like this:

dff.head()
Out[24]: 
   week         SITE_1                   LAL      SITE_2                   LAL
0     1  BARTON CHAPEL                   1.1  PENASCAL I                   1.0
1     2  BARTON CHAPEL                   1.1  PENASCAL I                   1.0
2     3  BARTON CHAPEL                   1.1  PENASCAL I                   1.0
3     4  BARTON CHAPEL                   1.1  PENASCAL I                   1.0
4     5  BARTON CHAPEL                   1.1  PENASCAL I                   1.0

THANK YOU for any help here.

Are you trying to [merge](https://stackoverflow.com/questions/53645882/pandas-merging-101)? — Quang Hoang, May 30 '23 at 23:18

Lookup Data in a DataFrame and assign to 2nd DataFrame when Condition Met

0 Answers0