I'm trying to find a faster way to apply a function several times to a set of data housed in DataFrames.
I have two DataFrames:
- Parameters: has one column per function argument and one row per parameter set, plus a column giving each set a unique name.
- Original Data: houses original data in a column
For each set of parameters, I want to add a column to the original DataFrame with the result from "func" and set the column name to the parameter set name.
Currently I'm looping through the rows of the parameter DataFrame, but I feel like there's a better way to do it.
I'm trying to see if there's a vectorized solution, but so far I've been unsuccessful working with two DataFrames.
I've tried following cs95's answer in the post "How to iterate over rows in a DataFrame in Pandas", but almost all of the vectorization and list-comprehension examples there deal with only a single DataFrame.
Is there a better way to do this?
I feel like there may be something obvious I'm missing.
import pandas as pd

def func(data, a, b, c):
    return data["original"] + a + b * c

parameters = pd.DataFrame(
    {
        "name": ["set_1", "set_2", "set_3"],
        "a": [1, 2, 3],
        "b": [4, 5, 6],
        "c": [7, 8, 9],
    }
)
data = pd.DataFrame({"original": [10, 11, 12, 13, 14, 15]})

for i, row in parameters.iterrows():
    data[row["name"]] = func(data, row["a"], row["b"], row["c"])
Inputs:
Parameters DataFrame:
    name  a  b  c
0  set_1  1  4  7
1  set_2  2  5  8
2  set_3  3  6  9
Original Data DataFrame:
   original
0        10
1        11
2        12
3        13
4        14
5        15
Output:
   original  set_1  set_2  set_3
0        10     39     52     67
1        11     40     53     68
2        12     41     54     69
3        13     42     55     70
4        14     43     56     71
5        15     44     57     72
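
For reference, here is a minimal sketch of what a loop-free version could look like using NumPy broadcasting. It assumes func stays the plain arithmetic expression shown above (original + a + b * c), so it may not carry over to a more complicated function. The names `offsets`, `values`, `new_cols`, and `result` are made up for illustration.

```python
import pandas as pd

parameters = pd.DataFrame(
    {
        "name": ["set_1", "set_2", "set_3"],
        "a": [1, 2, 3],
        "b": [4, 5, 6],
        "c": [7, 8, 9],
    }
)
data = pd.DataFrame({"original": [10, 11, 12, 13, 14, 15]})

# Assumption: func is exactly original + a + b * c, so the part that
# depends only on the parameters can be computed once per set.
offsets = (parameters["a"] + parameters["b"] * parameters["c"]).to_numpy()

# Broadcast a (n_rows, 1) column against a (n_sets,) row of offsets,
# giving an (n_rows, n_sets) array with one column per parameter set.
values = data["original"].to_numpy()[:, None] + offsets

# Label the new columns with the parameter set names and attach them.
new_cols = pd.DataFrame(
    values, index=data.index, columns=parameters["name"].tolist()
)
result = pd.concat([data, new_cols], axis=1)
```

Building all the columns in one array and attaching them with a single `pd.concat` also avoids inserting columns one at a time, though whether that matters depends on how many parameter sets there are.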