
I have a pandas DataFrame in Python:

Winner_height  Winner_rank  Loser_height  Loser_rank
183            15           185           32
195            42           178           12

I would like to get a shuffled DataFrame that keeps the information about both players and adds a field identifying the winner (0 if it is player 1, 1 if it is player 2), as below:

Player_1_height  Player_1_rank  Player_2_height  Player_2_rank  Winner
183              15             185              32             0
178              12             195              42             1

Is there an efficient way to swap groups of columns in pandas, i.e. without drawing a random number for each row and creating a duplicate DataFrame?

Thanks in advance

Joe4297
  • What constitutes a winner? –  May 08 '22 at 14:50
  • The information about the winner is encoded in the names of all the columns in the first table. The purpose of the transformation is to move that information into a single dedicated column, so that I can apply a machine learning algorithm to predict the Winner column – Joe4297 May 08 '22 at 15:06
  • Check here: https://stackoverflow.com/questions/29576430/shuffle-dataframe-rows – Inputvector May 08 '22 at 15:44
  • Thanks, but I don't understand how I could reuse this for my problem: here, it is not a matter of shuffling DataFrame rows, but columns, or more precisely, of shuffling data with rules based on columns – Joe4297 May 08 '22 at 16:07
  • If I understand correctly, you can use `np.where`: `df2["Winner"] = np.where(df2["Player_1_height"] == df1["Winner_height"], 0, 1)`. This checks whether `Winner_height` is the same as `Player_1_height` and populates the `Winner` column accordingly. – Rawson May 08 '22 at 16:52
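
For illustration, here is a minimal sketch of one way the transformation described in the question might be done without a Python-level loop. It is an assumption about what would work, not a confirmed answer. It draws one boolean mask for the whole frame in a single vectorized call, then uses `np.where` to decide per row which player goes into the `Player_1_*` slot. The function name `mix_players`, its `swap` and `seed` parameters, and the `fields` tuple are hypothetical names introduced here. Note that it records the mask directly in `Winner` rather than comparing heights afterwards, since a height comparison would misfire when both players have the same height.

```python
import numpy as np
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "Winner_height": [183, 195],
    "Winner_rank": [15, 42],
    "Loser_height": [185, 178],
    "Loser_rank": [32, 12],
})


def mix_players(df, fields=("height", "rank"), swap=None, seed=None):
    """Randomly assign winner/loser to Player_1/Player_2 slots (hypothetical helper).

    swap: optional boolean array; True means the winner goes to the
          Player_2 slot for that row. Drawn at random when omitted.
    """
    if swap is None:
        rng = np.random.default_rng(seed)
        # One vectorized draw for all rows instead of a per-row loop.
        swap = rng.random(len(df)) < 0.5
    swap = np.asarray(swap, dtype=bool)

    out = pd.DataFrame(index=df.index)
    for field in fields:
        winner = df[f"Winner_{field}"].to_numpy()
        loser = df[f"Loser_{field}"].to_numpy()
        # Where swap is True, the loser takes slot 1 and the winner slot 2.
        out[f"Player_1_{field}"] = np.where(swap, loser, winner)
        out[f"Player_2_{field}"] = np.where(swap, winner, loser)

    # 0 if the winner is player 1, 1 if the winner is player 2.
    out["Winner"] = swap.astype(int)

    # Group columns by player, matching the layout asked for in the question.
    ordered = [f"Player_{p}_{f}" for p in (1, 2) for f in fields] + ["Winner"]
    return out[ordered]


print(mix_players(df, seed=0))
```

Passing an explicit mask, e.g. `mix_players(df, swap=[False, True])`, makes the result deterministic, which may be handy for checking the logic against the expected table in the question. A random value per row is still drawn, but in a single NumPy call, and only the new frame is built rather than a duplicated copy of the original.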

0 Answers