How can I simplify my table with pandas so that the three columns Selection, Active, and Inactive collapse into a single column holding only the selected value?

Name    Selection   Active  Inactive
A       active      0       0.9
B       active      1       0.8
C       inactive    2       0.7
D       inactive    3       0.6
E       active      4       0.5

In other words: IF Selection = 'active' THEN Active ELSE Inactive AS Selected_Value, which should give the following result:

Name    Selected_Value
A       0
B       1
C       0.7
D       0.6
E       4 
Red
Greenhorn

2 Answers

The code below should provide you with what you are looking for.

df.loc[df['Selection'] == 'active','Selected_Value'] = df['Active']
df.loc[df['Selection'] == 'inactive','Selected_Value'] = df['Inactive']

or

# Title-case the labels so they match the column names, then pick one cell per row
idx, cols = pd.factorize(df['Selection'].str.title())
df = df.assign(Selected_Value=df.reindex(cols, axis=1).to_numpy()[range(len(df)), idx])

Output:

  Name Selection  Active  Inactive  Selected_Value
0    A    active       0       0.9             0.0
1    B    active       1       0.8             1.0
2    C  inactive       2       0.7             0.7
3    D  inactive       3       0.6             0.6
4    E    active       4       0.5             4.0
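
For a self-contained check, the two approaches above can be sketched as follows. The DataFrame construction is an assumption rebuilt from the question's table (this answer does not show it), and the names by_loc, by_factorize and values are only illustrative.

```python
import numpy as np
import pandas as pd

# Sample data rebuilt from the question's table (an assumption, not part of the answer)
df = pd.DataFrame({
    "Name": ["A", "B", "C", "D", "E"],
    "Selection": ["active", "active", "inactive", "inactive", "active"],
    "Active": [0, 1, 2, 3, 4],
    "Inactive": [0.9, 0.8, 0.7, 0.6, 0.5],
})

# Approach 1: boolean masks with .loc, one assignment per Selection value
by_loc = df.copy()
by_loc.loc[by_loc["Selection"] == "active", "Selected_Value"] = by_loc["Active"]
by_loc.loc[by_loc["Selection"] == "inactive", "Selected_Value"] = by_loc["Inactive"]

# Approach 2: factorize the title-cased labels, then pick one cell per row
idx, cols = pd.factorize(df["Selection"].str.title())
values = df.reindex(cols, axis=1).to_numpy()
by_factorize = df.assign(Selected_Value=values[np.arange(len(df)), idx])

print(by_loc["Selected_Value"].tolist())
print(by_factorize["Selected_Value"].tolist())
```

The factorize variant relies on the Selection labels matching the column names once title-cased, so it would need adjusting if the labels and column names diverged.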
rhug123
Here is how you can use numpy.where():

import numpy as np
import pandas as pd

df = pd.DataFrame({'Name': ['A', 'B', 'C', 'D', 'E'],
                   'Selection': ['active', 'active', 'inactive', 'inactive', 'active'],
                   'Active': [0, 1, 2, 3, 4],
                   'Inactive': [0.9, 0.8, 0.7, 0.6, 0.5]})

# Where Selection is 'active', take the value from Active; otherwise take it from Inactive
df['Selected_Value'] = np.where(df['Selection'] == 'active',
                                df['Active'],
                                df['Inactive'])

print(df['Selected_Value'])

Output:

0    0.0
1    1.0
2    0.7
3    0.6
4    4.0
Name: Selected_Value, dtype: float64
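
The question asks for a table with only Name and Selected_Value, so one possible last step is to keep just those two columns. This is a minimal sketch using the question's column names (Active and Inactive); the names selected and result are illustrative only.

```python
import numpy as np
import pandas as pd

# Sample data using the column names from the question's table
df = pd.DataFrame({
    "Name": ["A", "B", "C", "D", "E"],
    "Selection": ["active", "active", "inactive", "inactive", "active"],
    "Active": [0, 1, 2, 3, 4],
    "Inactive": [0.9, 0.8, 0.7, 0.6, 0.5],
})

# Pick Active where Selection is 'active', otherwise Inactive
selected = np.where(df["Selection"] == "active", df["Active"], df["Inactive"])

# Keep only Name and attach the combined column
result = df[["Name"]].assign(Selected_Value=selected)
print(result)
```

Because the combined column mixes integers from Active with floats from Inactive, the values come out as floats.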
Red