I'd like to create a new column by randomly sampling, row by row, from the values in the other columns.
Consider a dataframe with "N" columns as follows:
|---------------------|------------------|---------------------|
| Column 1 | Column 2 | Column N |
|---------------------|------------------|---------------------|
| 0.37 | 0.8 | 0.0 |
|---------------------|------------------|---------------------|
| 0.0 | 0.0 | B |
|---------------------|------------------|---------------------|
| A | 5 | 0.8 |
|---------------------|------------------|---------------------|
The resulting dataframe should look like this:
|---------------------|------------------|---------------------|---------------|
| Column 1 | Column 2 | Column N | Sampled |
|---------------------|------------------|---------------------|---------------|
| 0.37 | 0.8 | 0.0 | 0.8 |
|---------------------|------------------|---------------------|---------------|
| 0.0 | 0.0 | B | B |
|---------------------|------------------|---------------------|---------------|
| A | 5 | 0.8 | A |
|---------------------|------------------|---------------------|---------------|
The "Sampled" column's entries are created by randomly choosing one of the corresponding entries of the "N" columns. For example, "0.8" was chosen from Column 2, "B" from Column N, and so on.
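To pin down the semantics, here is a minimal sketch of the per-row sampling described above, assuming NumPy indexing is acceptable. The seed and the variable names (`rng`, `values`, `col_idx`) are my own choices for illustration, and I have not benchmarked this on a large frame:

```python
import numpy as np
import pandas as pd

# Toy frame mirroring the example table (mixed types, so dtype is object).
df = pd.DataFrame({
    "Column 1": [0.37, 0.0, "A"],
    "Column 2": [0.8, 0.0, 5],
    "Column N": [0.0, "B", 0.8],
})

# Seed chosen only to make the illustration reproducible.
rng = np.random.default_rng(0)

values = df.to_numpy()
n_rows, n_cols = values.shape

# Draw one column position per row, independently and uniformly.
col_idx = rng.integers(0, n_cols, size=n_rows)

# Pick values[i, col_idx[i]] for every row i in one indexing step.
df["Sampled"] = values[np.arange(n_rows), col_idx]
```

Each row gets its own independent draw, so different rows can take their value from different columns.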
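df.sample(axis=1)
simply chooses one column at random and returns it whole, so every row takes its value from the same column. This is NOT what I want: each row should get its own independent choice.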
What would be the fastest way to achieve this? The method needs to be efficient, as the original dataframe is large, with many rows and columns.