I have the following dataframe:

import numpy as np
import pandas as pd
import random
df = pd.DataFrame(np.random.randint(0,20,size=(2, 2)), columns=list('AB'))
df

    A   B
0   13  4
1   16  17

Then I create another dataframe in a loop, where each cell of the new dataframe holds a list. There is a post (Pandas split column of lists into multiple columns) that shows how to split such columns.

tmp_lst_1 = []
for index, row in df.iterrows():
    tmp_lst_2 = []
    for r in range(len(row)):
        tmp_lst_2.insert(r, random.sample(range(1, 50), 2) )
    tmp_lst_1.insert(index, tmp_lst_2)

df1 = pd.DataFrame(tmp_lst_1)
df1
     0             1
0   [21, 5]     [6, 42]
1   [49, 40]    [8, 45]

However, I was wondering whether there is a more efficient way to create this dataframe directly, without having to split each column of lists individually. I am looking to get something like this:

df1
    C   D   E   F
0  21   5   6  42
1  49  40   8  45
Alex Man
  • Should the code for the df dataframe use size (2, 4) instead of size (2, 2)? df = pd.DataFrame(np.random.randint(0,20,size=(2, 4)), columns=list('ABCD')) – David Erickson Mar 22 '20 at 04:51
  • Also, I believe these two lines of code are incorrectly placed at the top of the second portion rather than at the bottom: df1 = pd.DataFrame(tmp_lst_1) df1 – David Erickson Mar 22 '20 at 04:55
  • Thanks @DavidErickson. I updated the code. – Alex Man Mar 22 '20 at 05:57

1 Answer

I think looping with DataFrame.iterrows is not necessary here. You can use a nested list comprehension that flattens the sampled lists as it builds each row:

df = pd.DataFrame(np.random.randint(0,20,size=(2, 2)), columns=list('AB'))

tmp_lst_1 = [[x for r in range(len(df.columns)) 
                for x in random.sample(range(1, 50), 2)] 
                for i in range(len(df))]

df1 = pd.DataFrame(tmp_lst_1, index=df.index)
print(df1)
    0   1   2   3
0  23  24  42  48
1  26  43  24   5
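
The result above has default integer column labels, while the question asks for C, D, E, F. A minimal sketch of one way to label them, assuming the names from the question's example (the variable names rows and new_columns are only illustrative and not part of the original answer):

```python
import random

import numpy as np
import pandas as pd

df = pd.DataFrame(np.random.randint(0, 20, size=(2, 2)), columns=list('AB'))

# Two distinct random values per original column, flattened into one list per row
rows = [[x for _ in range(len(df.columns))
           for x in random.sample(range(1, 50), 2)]
        for _ in range(len(df))]

# Assumed labels: the question's desired output uses C, D, E, F
new_columns = list('CDEF')

df1 = pd.DataFrame(rows, index=df.index, columns=new_columns)
print(df1)
```

The values differ on every run because random.sample is not seeded here. The hard-coded labels only fit a two-column df with two samples per column; for other shapes the list of names has to be adjusted to len(df.columns) * 2 entries.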

Alternative without list comprehension:

tmp_lst_1 = []

for i in range(len(df)):
    flat_list = []
    for r in range(len(df.columns)):
        for x in random.sample(range(1, 50), 2):
            flat_list.append(x)
    tmp_lst_1.append(flat_list)
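
The loop version can then be passed to the DataFrame constructor in the same way as the list comprehension. As a rough sketch, the same idea wrapped in a function: make_flat_frame and its parameters k, low, high and seed are hypothetical names introduced here for illustration, and list.extend replaces the innermost append loop.

```python
import random

import pandas as pd


def make_flat_frame(df, k=2, low=1, high=50, seed=None):
    """Hypothetical helper: sample k distinct ints from range(low, high)
    for every cell of df and flatten them into one row per df row."""
    rng = random.Random(seed)  # local generator so a seed makes runs repeatable
    rows = []
    for _ in range(len(df)):
        flat_list = []
        for _ in range(len(df.columns)):
            # extend adds the k sampled values directly to the flat row
            flat_list.extend(rng.sample(range(low, high), k))
        rows.append(flat_list)
    return pd.DataFrame(rows, index=df.index)


df = pd.DataFrame({'A': [13, 16], 'B': [4, 17]})
df1 = make_flat_frame(df, seed=0)
print(df1.shape)  # (2, 4): 2 rows, 2 columns * 2 samples each
```

Passing a seed is optional; it is only there so that repeated calls return the same frame when comparing results.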
jezrael