
I have three lists:

Name = ["ABC", "DEF", "GHI"]
Year = [2016,2017]
Month = ["Aug","Jul","Jun"]

I want to create a DataFrame from these lists as follows:

df:
Name Year Month
ABC  2016 Aug
ABC  2016 Jul
ABC  2016 Jun
ABC  2017 Aug
ABC  2017 Jul
ABC  2017 Jun
DEF  2016 Aug
DEF  2016 Jul
DEF  2016 Jun
DEF  2017 Aug
DEF  2017 Jul
DEF  2017 Jun
..... and so on

for all values in the lists. Is there a method in Python (pandas, NumPy, or SciPy) that does this, or is looping the only way?

cs95

1 Answer


Use itertools.product:

import itertools
import pandas as pd
pd.DataFrame(list(itertools.product(Name, Year, Month)),
             columns=['Name', 'Year', 'Month'])

   Name  Year Month
0   ABC  2016   Aug
1   ABC  2016   Jul
2   ABC  2016   Jun
3   ABC  2017   Aug
4   ABC  2017   Jul
5   ABC  2017   Jun
6   DEF  2016   Aug
7   DEF  2016   Jul
8   DEF  2016   Jun
9   DEF  2017   Aug
10  DEF  2017   Jul
11  DEF  2017   Jun
12  GHI  2016   Aug
13  GHI  2016   Jul
14  GHI  2016   Jun
15  GHI  2017   Aug
16  GHI  2017   Jul
17  GHI  2017   Jun
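
If you would rather stay inside pandas, the same table can probably be built without itertools. This is only a sketch of an alternative, not part of the approach above: it assumes a pandas version where MultiIndex.to_frame accepts index=False, and the variable name df is just for illustration.

```python
import pandas as pd

Name = ["ABC", "DEF", "GHI"]
Year = [2016, 2017]
Month = ["Aug", "Jul", "Jun"]

# MultiIndex.from_product builds every (Name, Year, Month) combination,
# in the same order as itertools.product.
# to_frame(index=False) turns the index levels into ordinary columns
# and gives the result a default 0..n-1 index.
df = pd.MultiIndex.from_product(
    [Name, Year, Month], names=["Name", "Year", "Month"]
).to_frame(index=False)

print(df)
```

Each level keeps its own dtype here, so Year stays an integer column.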

If you want a fast numpy cartesian product, I'd suggest looking at

Swapping itertools.product for a NumPy alternative should be simple. All that's left is to call the pd.DataFrame constructor on the result.
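
As a rough illustration of what a NumPy-based version could look like, here is a minimal sketch. It is not necessarily the approach referred to above; it uses np.indices as one possible technique, and it takes the product over positions rather than values so that the mixed string and integer lists are not coerced to a single dtype. The names lists, columns, idx and df_np are made up for this example.

```python
import numpy as np
import pandas as pd

Name = ["ABC", "DEF", "GHI"]
Year = [2016, 2017]
Month = ["Aug", "Jul", "Jun"]

lists = [Name, Year, Month]
columns = ["Name", "Year", "Month"]

# np.indices returns, for each axis of a 3 x 2 x 3 grid, the position along
# that axis for every cell. Flattening gives one row of positions per list,
# ordered the same way as itertools.product.
idx = np.indices([len(values) for values in lists]).reshape(len(lists), -1)

# Index each list with its own positions, so every column keeps its dtype.
df_np = pd.DataFrame({
    col: np.asarray(values)[positions]
    for col, values, positions in zip(columns, lists, idx)
})

print(df_np)
```

Whether this is actually faster than itertools.product depends on the sizes of the lists, so it is worth timing on your own data.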

cs95