I have the following data frame and lists:

df_merge = pd.DataFrame({'column1': [0.5, 0.4, 0.9, 0.7],
               'column2': [0.7, 0.8, 0.2, 0.38],
               'column3': [0.6, 0.8, 0.3, 0.67],
               'column4': [0.1, 0.35, 0.55, 0.6],
               'group': ['1ab', '2ab', '3ab', '4ab'],
               'line': ['cc', 'gg', 'nn','pp'],
               'column5': ['-1', '-1', '0','0']})

list_0 = ['aa', 'bb', 'cc', 'dd', 'ee', 'ff']
list_1 = ['gg', 'hh', 'ii', 'jj', 'kk']
list_2 = ['ll', 'mm', 'nn']
list_3 = ['oo', 'pp']

I'm trying to apply a search in the condition variable, which will then be used with np.where and np.select.

Here df_merge['line'] can be any value from the lists above.

I tried the code below, but I'm not sure it is the right approach. It originally raised "TypeError: unhashable type: 'list'".

That error was resolved by using df_merge['line'].isin(list_0) for each list:

# isin already returns a boolean mask, so it is combined with & directly
condition = [(df_merge['group'] == '1ab') & df_merge['line'].isin(list_0),
             (df_merge['group'] == '2ab') & df_merge['line'].isin(list_1),
             (df_merge['group'] == '3ab') & df_merge['line'].isin(list_2),
             (df_merge['group'] == '4ab') & df_merge['line'].isin(list_3)]
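
To illustrate the point about isin, here is a minimal sketch on a cut-down copy of the data (only the group and line columns, and only list_0). The names in_list_0, mismatch and cond_0 are made up for this example. It shows that isin yields a boolean mask, and that comparing the string column itself against that mask is False on every row, so the mask is best used on its own.

```python
import pandas as pd

# Cut-down copy of the sample data: only the columns the conditions need
df_merge = pd.DataFrame({'group': ['1ab', '2ab', '3ab', '4ab'],
                         'line': ['cc', 'gg', 'nn', 'pp']})
list_0 = ['aa', 'bb', 'cc', 'dd', 'ee', 'ff']

# isin returns a boolean Series: True where the value appears in the list
in_list_0 = df_merge['line'].isin(list_0)
print(in_list_0.tolist())

# Comparing the string column against the mask compares text with True/False,
# which never matches, so every row comes out False
mismatch = df_merge['line'] == in_list_0
print(mismatch.tolist())

# Using the mask directly gives the intended row filter
cond_0 = (df_merge['group'] == '1ab') & in_list_0
print(cond_0.tolist())
```

The same pattern would apply to the other three lists; only the group label and the list change.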

After building the conditions above, I need to run the rest of the code:

# Flag each column: 1 where the value is below 0.6, otherwise 0
low1 = np.where(df_merge['column1'] >= 0.6, 0, 1)
low2 = np.where(df_merge['column2'] >= 0.6, 0, 1)
low3 = np.where(df_merge['column3'] >= 0.6, 0, 1)
low4 = np.where(df_merge['column4'] >= 0.6, 0, 1)

choices = [1 - (low1 + low2 + low3 + low4),
           1 - (low1 + low2 + low3),
           1 - (low1 + low2 + low4),
           1 - (low1 + low2)]

df_merge['column5'] = np.select(condition, choices, default=1 - (low1 + low2))

I'm not sure whether np.where can be used inside choices as shown above. It raised TypeError: '>=' not supported between instances of 'str' and 'float', which I solved by converting the string values in column1 to column4 to numeric.
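
For reference, a minimal end-to-end sketch of the steps described so far, assuming the four value columns arrive as strings. Using pd.to_numeric for the conversion is one possible choice, not necessarily what was done originally, and the names num_cols and low are made up for this example.

```python
import numpy as np
import pandas as pd

df_merge = pd.DataFrame({'column1': ['0.5', '0.4', '0.9', '0.7'],
                         'column2': ['0.7', '0.8', '0.2', '0.38'],
                         'column3': ['0.6', '0.8', '0.3', '0.67'],
                         'column4': ['0.1', '0.35', '0.55', '0.6'],
                         'group': ['1ab', '2ab', '3ab', '4ab'],
                         'line': ['cc', 'gg', 'nn', 'pp']})

list_0 = ['aa', 'bb', 'cc', 'dd', 'ee', 'ff']
list_1 = ['gg', 'hh', 'ii', 'jj', 'kk']
list_2 = ['ll', 'mm', 'nn']
list_3 = ['oo', 'pp']

# Convert the string columns to floats so that >= works against 0.6
num_cols = ['column1', 'column2', 'column3', 'column4']
df_merge[num_cols] = df_merge[num_cols].apply(pd.to_numeric)

# One 0/1 flag array per column: 1 where the value is below 0.6
low = {c: np.where(df_merge[c] >= 0.6, 0, 1) for c in num_cols}

# One boolean mask per group/list pair
condition = [(df_merge['group'] == '1ab') & df_merge['line'].isin(list_0),
             (df_merge['group'] == '2ab') & df_merge['line'].isin(list_1),
             (df_merge['group'] == '3ab') & df_merge['line'].isin(list_2),
             (df_merge['group'] == '4ab') & df_merge['line'].isin(list_3)]

# One candidate result array per condition, in the same order
choices = [1 - (low['column1'] + low['column2'] + low['column3'] + low['column4']),
           1 - (low['column1'] + low['column2'] + low['column3']),
           1 - (low['column1'] + low['column2'] + low['column4']),
           1 - (low['column1'] + low['column2'])]

# Rows matching no condition fall back to the two-column formula
default = 1 - (low['column1'] + low['column2'])
df_merge['column5'] = np.select(condition, choices, default=default)
print(df_merge['column5'].tolist())
```

On the four sample rows this sketch produces -1, 0, -1, 0. Whether that is the intended result depends on which columns each group is meant to count.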

The expected output will be:

df_merge = pd.DataFrame({'column1': ['0.5', '0.4', '0.9', '0.7'],
               'column2': ['0.7', '0.8', '0.2', '0.38'],
               'column3': ['0.6', '0.8', '0.3', '0.67'],
               'column4': ['0.1', '0.35', '0.55', '0.6'],
               'group': ['1ab', '2ab', '3ab', '4ab'],
               'line': ['cc', 'gg', 'nn','pp'],
               'column5': ['-1', '-1', '0','0']})

Any help / guidance is much appreciated.

Shri
  • In your code, what is `df_merge`? Is that `df`? – MattR Feb 25 '20 at 13:00
  • Yes, it's the data frame. I've updated the question. – Shri Feb 25 '20 at 13:00
  • So the error `TypeError: unhashable type: 'list'` is the thing you should focus on. `str.contains` is for strings, not lists. Try doing `df['line'].isin(list_0)` – MattR Feb 25 '20 at 13:02
  • `df['line'].isin(list_0)` worked. Thank you. How do I solve the next part, `np.where`? – Shri Feb 25 '20 at 13:11
  • `TypeError: '>=' not supported between instances of 'str' and 'float'.` That error was solved by converting the string values in column1 to column4 to numeric. I've updated the data frame df_merge. – Shri Feb 25 '20 at 13:48

0 Answers