
I have the following lists:

aa = ['aa1', 'aa2', 'aa3', 'aa4', 'aa5']
bb = ['bb1', 'bb2', 'bb3', 'bb4', 'bb5']
cc = ['cc1', 'cc2', 'cc3', 'cc4', 'cc5']

I want to create a pandas dataframe as such:

aa    bb    cc
aa1   bb1   cc1
aa2   bb1   cc1
aa3   bb1   cc1
aa4   bb1   cc1
aa5   bb1   cc1
aa1   bb2   cc1
aa1   bb3   cc1
aa1   bb4   cc1
aa1   bb5   cc1
aa1   bb1   cc2
aa1   bb1   cc3
aa1   bb1   cc4
aa1   bb1   cc5

I'm stuck on how to do this. I've looked at examples such as "How to generate all permutations of a list in Python".

I can do each permutation individually using:

import itertools
itertools.permutations(['aa1','aa2','aa3','aa4','aa5'])

I have a few dozen lists, and ideally I'd like to process them all automatically.

Appreciate any help!

cs95
Kvothe

2 Answers

13

I believe you need itertools.product, not permutations.

In [287]: lists = [aa, bb, cc]

In [288]: pd.DataFrame(list(itertools.product(*lists)), columns=['aa', 'bb', 'cc'])
Out[288]: 
      aa   bb   cc
0    aa1  bb1  cc1
1    aa1  bb1  cc2
2    aa1  bb1  cc3
3    aa1  bb1  cc4
4    aa1  bb1  cc5
5    aa1  bb2  cc1
6    aa1  bb2  cc2
7    aa1  bb2  cc3
8    aa1  bb2  cc4
...

This gives you the Cartesian product of your lists. The column names are hardcoded here, but you can use df.rename to rename them dynamically.
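As a minimal sketch of avoiding the hardcoded column names, the lists could be kept in a dict keyed by column name, so the same call scales to any number of lists. The `named_lists` dict is an assumption introduced for illustration; it is not part of the original answer.

```python
import itertools

import pandas as pd

# Hypothetical container: column name -> list of values (data from the question).
named_lists = {
    'aa': ['aa1', 'aa2', 'aa3', 'aa4', 'aa5'],
    'bb': ['bb1', 'bb2', 'bb3', 'bb4', 'bb5'],
    'cc': ['cc1', 'cc2', 'cc3', 'cc4', 'cc5'],
}

# itertools.product yields one tuple per combination; dicts preserve
# insertion order, so the values and the column names line up.
df = pd.DataFrame(
    list(itertools.product(*named_lists.values())),
    columns=list(named_lists),
)

print(df.shape)  # (125, 3): 5 * 5 * 5 combinations
```

Note that the full product grows multiplicatively with each added list, so with a few dozen lists it may be far too large to materialize.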

cs95
  • with *lists slightly more elegant than what I was about to post. However, I am not 100% sure yet whether OP wants the Cartesian product or a piecewise product, which would follow similar logic, but be a bit more involved. :) – Uvar Aug 14 '17 at 10:29
  • @Uvar Sure. Thanks for the feedback! – cs95 Aug 14 '17 at 10:35
0

I would suggest creating three dataframes and then concatenating them, like so:

import pandas as pd

aa = ['aa1', 'aa2', 'aa3', 'aa4', 'aa5']
bb = ['bb1', 'bb2', 'bb3', 'bb4', 'bb5']
cc = ['cc1', 'cc2', 'cc3', 'cc4', 'cc5']

df1= pd.DataFrame({'aa':aa})
df1['bb']= 'bb1'
df1['cc']= 'cc1'

df2= pd.DataFrame({'bb':bb[1:]})
df2['aa']= 'aa1'
df2['cc']= 'cc1'

df3= pd.DataFrame({'cc':cc[1:]})
df3['bb']= 'bb1'
df3['aa']= 'aa1'

# DataFrame.append was removed in pandas 2.0; pd.concat aligns columns by name.
df = pd.concat([df1, df2, df3], ignore_index=True)

This should return your desired dataframe.
I hope this helps!
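For more than a handful of lists, the same idea can be sketched as a loop: start from a baseline row made of each list's first element, then vary one column at a time. The function name `one_at_a_time` and the `named_lists` dict are hypothetical names used only for this illustration.

```python
import pandas as pd


def one_at_a_time(named_lists):
    """Vary one column at a time, holding the others at their first value."""
    # Baseline row: the first element of every list.
    baseline = {name: values[0] for name, values in named_lists.items()}
    rows = [baseline]
    for name, values in named_lists.items():
        # Skip the first value, since the baseline row already covers it.
        for value in values[1:]:
            rows.append({**baseline, name: value})
    return pd.DataFrame(rows, columns=list(named_lists))


# Data from the question, keyed by column name.
named_lists = {
    'aa': ['aa1', 'aa2', 'aa3', 'aa4', 'aa5'],
    'bb': ['bb1', 'bb2', 'bb3', 'bb4', 'bb5'],
    'cc': ['cc1', 'cc2', 'cc3', 'cc4', 'cc5'],
}

df = one_at_a_time(named_lists)
print(df)
```

With the three lists from the question this produces 13 rows in the order shown there, and the row count grows additively rather than multiplicatively as lists are added.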

Johny Vaknin