I have a DataFrame with a column of lists and would like to pick 6 random items from the list in each row. Every picked item should be saved in the same row, but in its own column.

DataFrame:

id  column_with_lists
1   ['10A','11B','12C','13D','14E','15F','16G','17H','18I']
2   ['34X','35Y','46Z','48A','49B','50C','51D']
3   ['232H', '23Q', '26W', '89D', '328A', '219C', '432G', '324A']

Desired result:

id  col     col1    col2     col3   col4    col5
1   '10A'   '14E'   '11B'   '18I'   '17H'   '13D'
2   '46Z'   '48A'   '49B'   '50C'   '51D'   '34X'
3   '232H'  '26W'   '89D'   '328A'  '432G'  '324A'

Edit: My DataFrame contains two columns: the first column is the id, and the second one, column_with_lists, holds the lists of ids. My goal is to get 6 random ids from the list within column_with_lists. Every picked id should be saved in the same row but in a different column of the DataFrame.

I was thinking about something like: df['col'] = df.apply( lambda x: random.sample( x['column_with_lists'], 1), axis=1) but for multiple columns.

jonas
  • Does this answer your question? [Shuffle DataFrame rows](https://stackoverflow.com/questions/29576430/shuffle-dataframe-rows) – godot Feb 09 '21 at 22:11
  • Unfortunately not, I was thinking about something like: df['col'] = df.apply( lambda x: random.sample( x['column_with_lists'], 1), axis=1) but for multiple columns. – jonas Feb 09 '21 at 22:15

2 Answers

1

I solved my problem by converting column_with_lists to a list and shuffling the items in each sublist. After that, I kept the first six items of each sublist and concatenated the result back to my DataFrame.

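import random

def shuffle_list(var):
    # Shuffle every sublist in place and return the outer list
    for i in var:
        random.shuffle(i)
    return var

# Turn the column into a list of lists (df is the DataFrame from the question).
# Note: the sublists are the same objects as in df, so they are shuffled there too.
col_list = df['column_with_lists'].tolist()

# Keep the first 6 items of each shuffled sublist
col_list = [item[:6] for item in shuffle_list(col_list)]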
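
The concatenation step is not shown above, so here is a minimal sketch of how it could look. It uses random.sample instead of shuffle-and-slice (a swapped-in technique that leaves the original lists untouched), and it assumes every list has at least 6 items. The names N_PICKS, names, picked_df and result are only for illustration; the column names follow the desired result in the question.

```python
import random

import pandas as pd

# Sample data copied from the question
df = pd.DataFrame({
    "id": [1, 2, 3],
    "column_with_lists": [
        ["10A", "11B", "12C", "13D", "14E", "15F", "16G", "17H", "18I"],
        ["34X", "35Y", "46Z", "48A", "49B", "50C", "51D"],
        ["232H", "23Q", "26W", "89D", "328A", "219C", "432G", "324A"],
    ],
})

N_PICKS = 6  # assumption: every list holds at least 6 items

# random.sample draws without replacement and does not modify the source list
picked = [random.sample(items, N_PICKS) for items in df["column_with_lists"]]

# Illustrative column names: col, col1, ..., col5
names = ["col"] + [f"col{k}" for k in range(1, N_PICKS)]

# Reuse df's index so that concat lines the rows up correctly
picked_df = pd.DataFrame(picked, columns=names, index=df.index)
result = pd.concat([df[["id"]], picked_df], axis=1)
print(result)
```

If a list can be shorter than 6, random.sample raises a ValueError, so those rows would need separate handling.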
jonas
0

If your data is in multiple arrays, you could do something like this:

import random

# One inner list per row of data
data = [
    ['a', 'b', 'c', 'd', 'e', 'f', 'g'],
    ['a', 'b', 'c', 'd', 'e', 'f', 'g'],
    ['a', 'b', 'c', 'd', 'e', 'f', 'g'],
]

# One empty result list per input list
output = [[], [], []]

for i, array in enumerate(data):
    # Draw 6 items; random.choice can return the same item more than once
    for _ in range(6):
        output[i].append(random.choice(array))

# POSSIBLE OUTPUT
# output = [
#     ['a', 'c', 'f', 'd', 'a', 'b'],
#     ['g', 'a', 'b', 'f', 'g', 'c'],
#     ['c', 'c', 'a', 'g', 'b', 'f'],
# ]

If you do not want repeats, you can check if the value has been added to the output array and not add it if it is already there.
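
A rough sketch of that check, assuming the same list-of-lists layout as above. pick_unique is a hypothetical helper name, and the guard against too few distinct values is my addition so the loop cannot run forever.

```python
import random

data = [
    ['a', 'b', 'c', 'd', 'e', 'f', 'g'],
    ['a', 'b', 'c', 'd', 'e', 'f', 'g'],
    ['a', 'b', 'c', 'd', 'e', 'f', 'g'],
]

def pick_unique(array, k=6):
    """Pick k distinct values from array in random order (hypothetical helper)."""
    # Without enough distinct values the loop below would never finish
    if len(set(array)) < k:
        raise ValueError("not enough distinct values to pick from")
    picks = []
    while len(picks) < k:
        candidate = random.choice(array)
        # Only add the value if it has not been picked already
        if candidate not in picks:
            picks.append(candidate)
    return picks

output = [pick_unique(array) for array in data]
print(output)

# The standard library can also do this in one call. random.sample picks
# distinct positions, so values only repeat if the input itself has duplicates.
shortcut = [random.sample(array, 6) for array in data]
```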