1

I have the following dataframe:

df = pd.DataFrame(data={'flag': ['col3', 'col2', 'col2'],
                        'col1': [1, 3, 2], 
                        'col2': [5, 2, 4], 
                        'col3': [6, 3, 6], 
                        'col4': [0, 4, 4]},
                        index=pd.Series(['A', 'B', 'C'], name='index'))
index flag col1 col2 col3 col4
A col3 1 5 6 0
B col2 3 2 3 4
C col2 2 4 6 4

For each row, I want to get the value when column name is equal to the flag.

index flag col1 col2 col3 col4 col_val
A col3 1 5 6 0 6
B col2 3 2 3 4 2
C col2 2 4 6 4 4

– Index A has a flag of col3. So col_val should be 6 because df['col3'] for that row is 6.
– Index B has a flag of col2. So col_val should be 2 because df['col2'] for that row is 2.
– Index C has a flag of col2. So col_val should be 4 because df['col2'] for that row is 3.

bltSandwich21
  • 432
  • 3
  • 10
  • 2
    you are doing a lookup; the pandas docs has an [example](https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#looking-up-values-by-index-column-labels) that you can adapt to your question – sammywemmy Nov 24 '21 at 01:29

3 Answers3

1

Per this page:

idx, cols = pd.factorize(df['flag'])
df['COl_VAL'] = df.reindex(cols, axis=1).to_numpy()[np.arange(len(df)), idx]

Output:

>>> df
       flag  col1  col2  col3  col4  COl_VAL
index                                       
A      col3     1     5     6     0        6
B      col2     3     2     3     4        2
C      col2     2     4     6     4        4
0

The docs has an example that you can adapt; the solution is below is just another option.

What it does is flip the dataframe into a MultiIndex dataframe, select the relevant columns and trim it to non nulls::

 cols = [(ent, ent) for ent in df.flag.unique()]
 (df.assign(col_val = df.pivot(index = None, columns = 'flag')
                        .loc(axis = 1)[cols].sum(1)
 )

        flag  col1  col2  col3  col4  col_val
index
A      col3     1     5     6     0      6.0
B      col2     3     2     3     4      2.0
C      col2     2     4     6     4      4.0
sammywemmy
  • 27,093
  • 4
  • 17
  • 31
0

try this:

cond = ([df.columns.values[1:]] * df.shape[0]) == df.flag.values.reshape(-1,1)
df1 = df.set_index('flag', append=True)
df1.join(df1.where(cond).ffill(axis=1).col4.rename('res')).reset_index('flag')

ziying35
  • 1,190
  • 3
  • 6