1

I have 2 pandas dataframes:

modal2:

   Mode month1  month2  month3  month4  month5  month6  month7  month8  month9  month10 month11 month12
0   100 0   0   0   0   0   0   0   0   0   0   0   0
2   602 0   2   1   0   2   1   0   2   1   0   2   1
3   603 1   0   2   1   0   2   1   0   2   1   0   2
11  802 0   11  10  9   8   7   6   5   4   3   2   1

df_ia:

      RevalMonth_plus  Mode
0     1                602
35    1                100
52    4                100
79    1                802 
94    4                603
95    4                603
96    4                603 
98    1                100 

the logic is compare each row of df_ia to each row of modal2,if row of df_ia has the same Mode number as row of modal2 , then add the column month1 of modal2 to df_ia.

I can successfully complete this part by the following code:

 for index, row in modal2.iterrows():
        conditions.append(df_ia['Mode'] == row['Mode'])


    for col in ['month1']: #for loop list is because sometime need more than 1 column  
        col_v = modal2[col]
        df_ia[col] = np.select(conditions, col_v, default=None)

the output is:

      RevalMonth_plus  Mode  month1
0     1                602   0
35    1                100   0
52    4                100   0
79    1                802   0 
94    4                603   1
95    4                603   1
96    4                603   1 
98    1                100   0  

But the challenge part is:

how can I replace 'month1' to the corresponding 'RevalMonth_plus' from df_ia, just like this:

for col in [f'month{df_ia['RevalMonth_plus']}']: #for loop list is because sometime need more than 1 column  
    col_v = modal2[col]
    df_ia[col] = np.select(conditions, col_v, default=None)

The ideal output should like:

      RevalMonth_plus  Mode  dynamic_moth_value
0     1                602   0                  #month1
35    1                100   0                  #month1
52    4                100   0                  #month4
79    1                802   0                  #month1
94    4                603   1                  #month4
95    4                603   1                  #month4
96    4                603   1                  #month4
98    1                100   0                  #month1

I have spent a whole day on it.But cannot make it.

William
  • 3,724
  • 9
  • 43
  • 76

3 Answers3

2

Try with pd.wide_to_long to reshape the DataFrame modal2, then merge left to keep order of df_ia:

df_long = pd.wide_to_long(modal2, stubnames='month',
                          i='Mode', j='RevalMonth_plus').reset_index()
new_df = df_ia.merge(
    df_long,
    how='left',
    on=['RevalMonth_plus', 'Mode']
).rename(columns={'month': 'dynamic_month_value'})

new_df:

   RevalMonth_plus  Mode  dynamic_month_value
0                1   602                    0
1                1   100                    0
2                4   100                    0
3                1   802                    0
4                4   603                    1
5                4   603                    1
6                4   603                    1
7                1   100                    0

An example with unique values to show more clearly the mapping:

modal2 = pd.DataFrame({
    'Mode': [100, 602, 603, 802],
    'month1': [1, 2, 3, 4],
    'month4': [5, 6, 7, 8]
})

df_ia = pd.DataFrame({
    'RevalMonth_plus': [1, 1, 4, 1, 4, 4, 4, 1],
    'Mode': [602, 100, 100, 802, 603, 603, 603, 100]
})

modal2:

   Mode  month1  month4
0   100       1       5
1   602       2       6
2   603       3       7
3   802       4       8

df_ia:

   RevalMonth_plus  Mode
0                1   602
1                1   100
2                4   100
3                1   802
4                4   603
5                4   603
6                4   603
7                1   100

new_df:

   RevalMonth_plus  Mode  dynamic_month_value
0                1   602                    2
1                1   100                    1
2                4   100                    5
3                1   802                    4
4                4   603                    7
5                4   603                    7
6                4   603                    7
7                1   100                    1

All Month columns can be mapped with merge first:

# Merge left to bring all Month columns into new_df
new_df = df_ia.merge(modal2, on='Mode', how='left')

df_long = pd.wide_to_long(modal2, stubnames='month',
                          i='Mode', j='RevalMonth_plus').reset_index()
new_df = new_df.merge(
    df_long,
    how='left',
    on=['RevalMonth_plus', 'Mode']
).rename(columns={'month': 'dynamic_month_value'})
   RevalMonth_plus  Mode  month1  month4  dynamic_month_value
0                1   602       2       6                    2
1                1   100       1       5                    1
2                4   100       1       5                    5
3                1   802       4       8                    4
4                4   603       3       7                    7
5                4   603       3       7                    7
6                4   603       3       7                    7
7                1   100       1       5                    1
Henry Ecker
  • 34,399
  • 18
  • 41
  • 57
  • Hi friend,thank you so much for your help,what if I want to get 2 columns? – William Jul 08 '21 at 22:35
  • Dynamically? or just month value columns? mapped by Mode? As the others have mentioned, you can `merge` to bring the columns over. – Henry Ecker Jul 08 '21 at 22:36
  • Not dynamically just hard coding for example month7 – William Jul 08 '21 at 22:55
  • See the merge code at the end then just filter `new_df` to contain the columns desired. Something like -> `new_df[["RevalMonth_plus", "Mode", "month7", "dynamic_month_value"]]` – Henry Ecker Jul 08 '21 at 22:58
  • Hi friend can you help me check this:https://stackoverflow.com/questions/68370860/how-to-use-if-else-in-pandas-numpy-when-apply-function-in-all-the-rows-fast – William Jul 14 '21 at 01:08
  • Hi friend can you help me with this very similar question:https://stackoverflow.com/questions/68371165/numpy-ndarray-object-has-no-attribute-str-while-using-if-else-in-numpy-panda – William Jul 14 '21 at 02:06
  • Hi friend can you help me with this question?https://stackoverflow.com/questions/68476193/how-to-merge-2-pandas-daataframes-base-on-multiple-conditions-faster – William Jul 21 '21 at 20:39
2

Another solution:

x = pd.merge(df_ia, modal2, on="Mode", how="left")
x["dynamic_month_value"] = x.apply(
    lambda x: x["month" + str(x["RevalMonth_plus"])], axis=1
)
print(x[["RevalMonth_plus", "Mode", "dynamic_month_value"]])

Prints:

   RevalMonth_plus  Mode  dynamic_month_value
0                1   602                    0
1                1   100                    0
2                4   100                    0
3                1   802                    0
4                4   603                    1
5                4   603                    1
6                4   603                    1
7                1   100                    0
Andrej Kesely
  • 168,389
  • 15
  • 48
  • 91
  • Thank you so much for your help, what if I need get another column at the same time for example month7 ? can I do something like: x = pd.merge(df_ia, modal2, on="Mode", how="left",'month7') x["dynamic_month_value"] = x.apply( lambda x: x["month" + str(x["RevalMonth_plus"])], axis=1 ) print(x[["RevalMonth_plus", "Mode", "dynamic_month_value"]]) – William Jul 08 '21 at 22:53
  • 1
    `x` already has all of the month columns from merge. `x[["RevalMonth_plus", "Mode", "month7", "dynamic_month_value"]]` – Henry Ecker Jul 08 '21 at 23:00
  • Hi friend can you help me check this:https://stackoverflow.com/questions/68370860/how-to-use-if-else-in-pandas-numpy-when-apply-function-in-all-the-rows-fast – William Jul 14 '21 at 01:08
  • Hi friend can you help me with this very similar question:https://stackoverflow.com/questions/68371165/numpy-ndarray-object-has-no-attribute-str-while-using-if-else-in-numpy-panda – William Jul 14 '21 at 02:07
  • Hi friend can you help me with this question?https://stackoverflow.com/questions/68476193/how-to-merge-2-pandas-daataframes-base-on-multiple-conditions-faster – William Jul 21 '21 at 20:39
1

It is almost always wrong to iterate over dataframe rows. You need to merge the dataframes:

df_ia['dynamic_month_value'] = modal2.merge(df_ia).set_index(df_ia.index)['month1']

#    RevalMonth_plus  Mode  dynamic_month_value
#0                 1   602       0
#35                1   100       0
#52                4   100       0
#79                1   802       0
#94                4   603       1
#95                4   603       1
#96                4   603       1
#98                1   100       0
DYZ
  • 55,249
  • 10
  • 64
  • 93
  • Thank you for your reply but the Mode need be same. – William Jul 08 '21 at 22:17
  • It is the same. – DYZ Jul 08 '21 at 22:17
  • Hi friend can you help me check this:https://stackoverflow.com/questions/68370860/how-to-use-if-else-in-pandas-numpy-when-apply-function-in-all-the-rows-fast – William Jul 14 '21 at 01:08
  • Hi friend can you help me with this question?https://stackoverflow.com/questions/68476193/how-to-merge-2-pandas-daataframes-base-on-multiple-conditions-faster – William Jul 21 '21 at 20:39