Create a pandas DataFrame from a Cartesian product of two large lists

Question

I'm looking for the simplest way to create a data frame from two others such that it contains all combinations of their elements. For instance we have these two dataframes:

import pandas as pd

list1 = ["A", "B", "C", "D", "E"]
list2 = ["x1", "x2", "x3", "x4", "x5", "x6", "x7", "x8"]

df1 = pd.DataFrame(list1)
df2 = pd.DataFrame(list2)

The result must be:

   0   1
0  A  x1
1  A  x2
2  A  x3
3  A  x4
4  A  x5
5  A  x6
6  A  x7
7  A  x8
8  B  x1
9  B  x2

I tried building the combinations directly from the lists. That works fine with small lists, but not with large ones. Thank you.

Can you elaborate a bit about the desired output? How did you end up with the desired output included in your question? — Grzegorz Skibinski, May 25 '20 at 20:51
Does this answer your question? [Get all combinations of elements from two lists?](https://stackoverflow.com/questions/25634489/get-all-combinations-of-elements-from-two-lists) — Georgy, May 29 '20 at 12:11

score 6 · Answer 1 · answered May 25 '20 at 20:45

You can use itertools.product:

import itertools
import pandas as pd

list1 = ["A", "B", "C", "D", "E"]
list2 = ["x1", "x2", "x3", "x4", "x5", "x6", "x7", "x8"]
result = pd.DataFrame(list(itertools.product(list1, list2)))

João Victor


Thanks for answering. That's what I did. But it does not work for big dataframes – Mus May 25 '20 at 21:04
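For larger inputs, one possible alternative is to skip the intermediate Python list of tuples and build each column with NumPy instead. This is only a minimal sketch: the column names `left` and `right` are made up for illustration, and the result still has len(list1) * len(list2) rows, so it will not help if the product itself does not fit in memory.

```python
import numpy as np
import pandas as pd

list1 = ["A", "B", "C", "D", "E"]
list2 = ["x1", "x2", "x3", "x4", "x5", "x6", "x7", "x8"]

# Repeat each element of list1 len(list2) times: A, A, ..., B, B, ...
left = np.repeat(list1, len(list2))
# Repeat the whole of list2 len(list1) times: x1..x8, x1..x8, ...
right = np.tile(list2, len(list1))

# "left" and "right" are hypothetical column names, not from the question
result = pd.DataFrame({"left": left, "right": right})
```

The row order matches the output requested in the question: the first list varies slowest, the second fastest.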

score 4 · Accepted Answer · answered May 25 '20 at 20:43

import pandas as pd

list1 = ["A", "B", "C", "D", "E"]
list2 = ["x1", "x2", "x3", "x4", "x5", "x6", "x7", "x8"]

df1 = pd.DataFrame(list1)
df2 = pd.DataFrame(list2)

# Give both frames the same constant key so every row of df1 matches every row of df2
df1['key'] = 0
df2['key'] = 0
print( df1.merge(df2, on='key', how='outer').drop(columns='key') )

Prints:

   0_x 0_y
0    A  x1
1    A  x2
2    A  x3
3    A  x4
4    A  x5
5    A  x6
6    A  x7
7    A  x8
8    B  x1
9    B  x2

...
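On pandas 1.2 or newer, the dummy key should not be needed, since `merge` accepts `how="cross"`. A minimal sketch under that version assumption (renaming the columns to 0 and 1 is just my choice, to match the output in the question):

```python
import pandas as pd

list1 = ["A", "B", "C", "D", "E"]
list2 = ["x1", "x2", "x3", "x4", "x5", "x6", "x7", "x8"]

df1 = pd.DataFrame(list1)
df2 = pd.DataFrame(list2)

# Assumes pandas >= 1.2: a cross merge pairs every row of df1 with every row of df2
result = df1.merge(df2, how="cross")

# Both inputs have a column named 0, so the merge adds suffixes; rename to 0 and 1
result.columns = [0, 1]
```

This leaves df1 and df2 unmodified, unlike the version that adds a `key` column to both.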

score 3 · Answer 3 · edited Jun 12 '20 at 00:00

You want to join each element in df1 with all elements of df2.

You can do it using df.merge:

In [1820]: df1['tmp'] = 1   ## Create a dummy key in df1
In [1821]: df2['tmp'] = 1   ## Create a dummy key in df2

## Merge both frames on `tmp`
In [1824]: df1.merge(df2, on='tmp').drop(columns='tmp').rename(columns={'0_x': '0', '0_y':'1'})
Out[1824]: 
    0   1
0   A  x1
1   A  x2
2   A  x3
3   A  x4
4   A  x5
5   A  x6
6   A  x7
7   A  x8
8   B  x1
9   B  x2
10  B  x3
11  B  x4
12  B  x5
13  B  x6
14  B  x7
15  B  x8
16  C  x1
17  C  x2
18  C  x3
...
...
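Pairing each element of df1 with all elements of df2 can probably also be done without a temporary column, via `pd.MultiIndex.from_product`. A rough sketch, where the names `left` and `right` are hypothetical and only there to label the columns:

```python
import pandas as pd

list1 = ["A", "B", "C", "D", "E"]
list2 = ["x1", "x2", "x3", "x4", "x5", "x6", "x7", "x8"]

df1 = pd.DataFrame(list1)
df2 = pd.DataFrame(list2)

# Build the product of the two columns as a MultiIndex;
# "left" and "right" are hypothetical level names, not from the question
index = pd.MultiIndex.from_product([df1[0], df2[0]], names=["left", "right"])

# Turn the index levels into ordinary columns with a fresh 0..n-1 index
result = index.to_frame(index=False)
```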
