Overview

How do you populate a pandas DataFrame using math that takes the row and column indices as variables?

Setup

import pandas as pd
import numpy as np

df = pd.DataFrame(index = range(5), columns = ['Combo_Class0', 'Combo_Class1', 'Combo_Class2', 'Combo_Class3', 'Combo_Class4'])

Objective

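Each cell in df = row index * (column index + 2), with both indices counted from 1. For example, the cell in the second row and third column is 2 * (3 + 2) = 10.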

Attempt 1

Using this solution, I can produce the following code:

row = 0
for i in range(5):
    row = row + 1
    df.loc[i] = [(row)*(1+2), (row)*(2+2), (row)*(3+2), (row)*(4+2), (row)*(5+2)]

Attempt 2

This solution seemed relevant as well, although I believe I've read you're not supposed to loop through dataframes. Besides, I'm not seeing how to loop through rows and columns:

for i, j in df.iterrows(): 
    df.loc[i] = i
PizzaAndCode

2 Answers

You can leverage broadcasting for a more efficient approach:

ix = (df.index+1).to_numpy() # use .values instead on pandas < 0.24
df[:] = ix[:,None] * (ix+2)

print(df)

   Combo_Class0  Combo_Class1  Combo_Class2  Combo_Class3  Combo_Class4
0             3             4             5             6             7
1             6             8            10            12            14
2             9            12            15            18            21
3            12            16            20            24            28
4            15            20            25            30            35
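
Note that reusing ix for both axes works here only because the frame is square (5 rows, 5 columns). Below is a minimal sketch for the general case, assuming 1-based row and column numbers as in the question. The names rows, cols, values and result are illustrative and not part of the original answer:

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(index=range(5),
                  columns=[f"Combo_Class{k}" for k in range(5)])

# 1-based row and column numbers, each a 1-D array
rows = np.arange(1, df.shape[0] + 1)   # shape (n_rows,)
cols = np.arange(1, df.shape[1] + 1)   # shape (n_cols,)

# rows[:, None] has shape (n_rows, 1); cols + 2 has shape (n_cols,).
# Broadcasting expands both to (n_rows, n_cols), so
# values[i, j] == rows[i] * (cols[j] + 2).
values = rows[:, None] * (cols + 2)

# Build a new frame that keeps the original labels
result = pd.DataFrame(values, index=df.index, columns=df.columns)
print(result)
```

Building a new frame rather than assigning into df[:] should also give integer columns, whereas the empty frame from the question starts out with object dtype.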
yatu
  • This returns the error _AttributeError: 'RangeIndex' object has no attribute 'to_numpy'_. Curious how you were able to print your results... – PizzaAndCode Jun 01 '19 at 17:13
  • Yes I do mention using `.values` for pandas versions under 0.24 in a comment. Use `.values` instead of `to_numpy` @PizzaAndCode – yatu Jun 01 '19 at 17:15
  • Ah my brain misinterpreted your comment, it makes sense now. It works perfectly using `.values`. Fantastic! Thank you for the follow-up @yatu. This is a more general solution, so marking it as the accepted answer. – PizzaAndCode Jun 01 '19 at 17:25

Using np.multiply.outer:

df[:] = np.multiply.outer(np.arange(5) + 1, np.arange(5) + 3)
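
For 1-D inputs this should give the same array as the broadcasting approach. Here is a small sketch to check that, where rows, col_terms, outer, broadcast and same are illustrative names and not from the original answer:

```python
import numpy as np

rows = np.arange(5) + 1        # 1-based row numbers: 1, 2, 3, 4, 5
col_terms = np.arange(5) + 3   # column number + 2:   3, 4, 5, 6, 7

# ufunc.outer applies the ufunc to every pair (rows[i], col_terms[j])
outer = np.multiply.outer(rows, col_terms)

# Same computation written with explicit broadcasting
broadcast = rows[:, None] * col_terms

same = np.array_equal(outer, broadcast)
print(same)
```

The first vector holds the row numbers and the second holds column number + 2, which is why the second arange is offset by 3 rather than 1.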
BENY
  • Python for Data Analysis 2nd Ed (ISBN 978-1-491-95766-0), pg 473 contains code similar to this. Thank you for sharing! – PizzaAndCode Jun 01 '19 at 17:16