I have some Python code that runs nested for loops and prints every combination of results. I am trying to figure out how to collect all of this into a single dataframe, with the rows in the order the results are produced. I will explain below.

I have the following code:

categories = ['small', 'medium', 'big']
parameters = ['p1_5_p2_4_p3_2', 'p1_3_p2_8_p3_3', 'p1_4_p2_3_p3_6']
Blue = [5, 4, 3]

for parameter in parameters:
    for category in categories:
        for x in Blue:
            y = x + 1
            z = x + 2
            
            print(category)
            print(parameter)
            print(y)
            print(z)
            print('')

which produces:

small
p1_5_p2_4_p3_2 
6 
7

small 
p1_5_p2_4_p3_2 
5 
6

small 
p1_5_p2_4_p3_2 
4 
5

medium 
p1_5_p2_4_p3_2 
6 
7

medium 
p1_5_p2_4_p3_2 
5 
6

medium 
p1_5_p2_4_p3_2 
4 
5

big 
p1_5_p2_4_p3_2 
6 
7

big 
p1_5_p2_4_p3_2 
5 
6

big
p1_5_p2_4_p3_2 
4 
5 

small
p1_3_p2_8_p3_3
6 
7
...

Is there a way to just send this to a pandas dataframe so that the dataframe looks like:

Category      Parameters         Value_1    Value_2
----------------------------------------------------
small         p1_5_p2_4_p3_2           6          7 
small         p1_5_p2_4_p3_2           5          6
small         p1_5_p2_4_p3_2           4          5
medium        p1_5_p2_4_p3_2           6          7
medium        p1_5_p2_4_p3_2           5          6
medium        p1_5_p2_4_p3_2           4          5
big           p1_5_p2_4_p3_2           6          7
big           p1_5_p2_4_p3_2           5          6
big           p1_5_p2_4_p3_2           4          5
small         p1_3_p2_8_p3_3           6          7   
...  

Is there a way to organize my initial outputs into this dataframe?

Michael Delgado
  • Yeah - check out the docs on [`pd.DataFrame`](https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.html) - there are lots of ways to do this. – Michael Delgado Aug 20 '21 at 17:01
  • Does this answer your question? [cartesian product in pandas](https://stackoverflow.com/questions/13269890/cartesian-product-in-pandas) – Michael Delgado Aug 20 '21 at 17:03

3 Answers

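You can use itertools.product. Take the product of the three inputs only, and compute both values from the same x so that they stay paired; a four-way product over two shifted copies of Blue would instead yield every Value_1/Value_2 combination (81 rows):

from itertools import product

import pandas as pd

categories = ["small", "medium", "big"]
parameters = ["p1_5_p2_4_p3_2", "p1_3_p2_8_p3_3", "p1_4_p2_3_p3_6"]
Blue = [5, 4, 3]

# parameters comes first so the row order matches the nested loops
df = pd.DataFrame(
    [
        (category, parameter, x + 1, x + 2)
        for parameter, category, x in product(parameters, categories, Blue)
    ],
    columns=["Category", "Parameters", "Value_1", "Value_2"],
)
print(df.head())

Prints:

   Category      Parameters  Value_1  Value_2
0     small  p1_5_p2_4_p3_2        6        7
1     small  p1_5_p2_4_p3_2        5        6
2     small  p1_5_p2_4_p3_2        4        5
3    medium  p1_5_p2_4_p3_2        6        7
4    medium  p1_5_p2_4_p3_2        5        6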
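
A pandas-only variant of the same cartesian-product idea could look roughly like the sketch below. It assumes `pd.MultiIndex.from_product` is acceptable for building the combinations; the helper column `x` and the name `df_product` are made up for illustration and are not part of the original answer.

```python
import pandas as pd

categories = ["small", "medium", "big"]
parameters = ["p1_5_p2_4_p3_2", "p1_3_p2_8_p3_3", "p1_4_p2_3_p3_6"]
Blue = [5, 4, 3]

# Cartesian product of the three inputs. The first level varies slowest,
# so listing parameters first mirrors the outer loop in the question.
index = pd.MultiIndex.from_product(
    [parameters, categories, Blue],
    names=["Parameters", "Category", "x"],  # "x" is a temporary helper column
)
df_product = index.to_frame(index=False)

# Derive both values from the same x so they stay paired row by row.
df_product["Value_1"] = df_product["x"] + 1
df_product["Value_2"] = df_product["x"] + 2

# Drop the helper column and put the columns in the requested order.
df_product = df_product[["Category", "Parameters", "Value_1", "Value_2"]]
print(df_product.head())
```

Because the derived columns are computed after the product is built, Value_1 and Value_2 cannot drift apart, which is the main thing to watch for when using a product here.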
Andrej Kesely

itertools.product is the most Pythonic way to do this. However, if you want to keep the code you already have, you're almost there: append each row to a list instead of printing it.

import pandas as pd

# create a list to append your values into
data = []

categories = ['small', 'medium', 'big']
parameters = ['p1_5_p2_4_p3_2', 'p1_3_p2_8_p3_3', 'p1_4_p2_3_p3_6']
Blue = [5, 4, 3]

for parameter in parameters:
    for category in categories:
        for x in Blue:
            y = x + 1
            z = x + 2

            # append instead of printing
            row = [category, parameter, y, z]
            data.append(row)

# create your dataframe
my_df = pd.DataFrame(columns=['Category', 'Parameters', 'Value_1', 'Value_2'], data=data)
print(my_df.head())

   Category      Parameters  Value_1  Value_2
0     small  p1_5_p2_4_p3_2        6        7
1     small  p1_5_p2_4_p3_2        5        6
2     small  p1_5_p2_4_p3_2        4        5
3    medium  p1_5_p2_4_p3_2        6        7
4    medium  p1_5_p2_4_p3_2        5        6
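
If keeping the values and the column names in sync by position feels fragile, one possible variation is to append dictionaries instead of lists, so each value is labeled where it is created. This is only a sketch; the names `records` and `df_records` are made up for illustration.

```python
import pandas as pd

categories = ['small', 'medium', 'big']
parameters = ['p1_5_p2_4_p3_2', 'p1_3_p2_8_p3_3', 'p1_4_p2_3_p3_6']
Blue = [5, 4, 3]

records = []
for parameter in parameters:
    for category in categories:
        for x in Blue:
            # The dict keys become the column names of the dataframe.
            records.append({
                'Category': category,
                'Parameters': parameter,
                'Value_1': x + 1,
                'Value_2': x + 2,
            })

# Column order follows the key order of the dictionaries.
df_records = pd.DataFrame(records)
print(df_records.head())
```

Either way, the dataframe is built once at the end from a plain Python list, which is generally cheaper than growing a dataframe row by row inside the loop.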
G. Anderson

Create a list to hold the rows, where each item is itself a list of one row's values, forming a 2D list. Then pass it to pd.DataFrame to build the dataframe from that list.

import pandas as pd

categories = ['small', 'medium', 'big']
parameters = ['p1_5_p2_4_p3_2', 'p1_3_p2_8_p3_3', 'p1_4_p2_3_p3_6']
Blue = [5, 4, 3]
ls = []
for parameter in parameters:
    for category in categories:
        for x in Blue:
            y = x + 1
            z = x + 2
            ls.append([category, parameter, y, z])
df = pd.DataFrame(ls, columns=['Category', 'Parameters', 'Value_1', 'Value_2'])
print(df)

   Category      Parameters  Value_1  Value_2
0     small  p1_5_p2_4_p3_2        6        7
1     small  p1_5_p2_4_p3_2        5        6
2     small  p1_5_p2_4_p3_2        4        5
3    medium  p1_5_p2_4_p3_2        6        7
4    medium  p1_5_p2_4_p3_2        5        6
5    medium  p1_5_p2_4_p3_2        4        5
6       big  p1_5_p2_4_p3_2        6        7
7       big  p1_5_p2_4_p3_2        5        6
8       big  p1_5_p2_4_p3_2        4        5
9     small  p1_3_p2_8_p3_3        6        7
10    small  p1_3_p2_8_p3_3        5        6
11    small  p1_3_p2_8_p3_3        4        5
12   medium  p1_3_p2_8_p3_3        6        7
13   medium  p1_3_p2_8_p3_3        5        6
14   medium  p1_3_p2_8_p3_3        4        5
15      big  p1_3_p2_8_p3_3        6        7
16      big  p1_3_p2_8_p3_3        5        6
17      big  p1_3_p2_8_p3_3        4        5
18    small  p1_4_p2_3_p3_6        6        7
19    small  p1_4_p2_3_p3_6        5        6
20    small  p1_4_p2_3_p3_6        4        5
21   medium  p1_4_p2_3_p3_6        6        7
22   medium  p1_4_p2_3_p3_6        5        6
23   medium  p1_4_p2_3_p3_6        4        5
24      big  p1_4_p2_3_p3_6        6        7
25      big  p1_4_p2_3_p3_6        5        6
26      big  p1_4_p2_3_p3_6        4        5
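
As a quick sanity check on this approach, the sketch below builds the same 2D list with a single comprehension and asserts two properties the result should have. The names `rows` and `df_check` are made up for illustration, and the column names follow the question.

```python
import pandas as pd

categories = ['small', 'medium', 'big']
parameters = ['p1_5_p2_4_p3_2', 'p1_3_p2_8_p3_3', 'p1_4_p2_3_p3_6']
Blue = [5, 4, 3]

# Same loop nesting as above: parameter outermost, x innermost.
rows = [
    [category, parameter, x + 1, x + 2]
    for parameter in parameters
    for category in categories
    for x in Blue
]
df_check = pd.DataFrame(rows, columns=['Category', 'Parameters', 'Value_1', 'Value_2'])

# One row per (parameter, category, x) combination: 3 * 3 * 3 = 27.
assert len(df_check) == len(parameters) * len(categories) * len(Blue)
# Both values come from the same x, so they always differ by exactly 1.
assert (df_check['Value_2'] - df_check['Value_1'] == 1).all()
```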
ThePyGuy