I need to "apply" a function to a DataFrame row by row, taking two particular cells of the current row as input to perform an operation. The function is the following:
values = []  # global list that the function fills

def function(x, y):
    z = 2*x*y
    values.append(z)
    return z
The catch is that I don't actually need the function's result: I only need the input values to perform some operations and fill the global list called values.
Suppose the pd.DataFrame is the following:
| col1 | col2 | col3 |
| ---- | ---- | ---- |
| 2 | 3 | 5 |
| 10 | 12 | 14 |
| ... | ... | ... |
I would usually apply the function like this:
df.apply(lambda x: function(x['col2'], x['col3']), axis=1)
The problem with apply is that this line creates a pd.Series, so I end up holding in memory not only the global list values, which I need for other purposes (the list is just an example standing in for some other data structure built by the function), but also this Series, which I don't need at all.
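To make the issue concrete, here is a minimal reproduction. It assumes a DataFrame built from only the two concrete rows of the table (the elided rows are left out), and the name result is just for illustration:

```python
import pandas as pd

values = []  # global list filled as a side effect


def function(x, y):
    z = 2 * x * y
    values.append(z)
    return z


# Assumption: only the two concrete rows of the sample table are used.
df = pd.DataFrame({"col1": [2, 10], "col2": [3, 12], "col3": [5, 14]})

# apply collects every return value into a new object, one entry per row
result = df.apply(lambda x: function(x["col2"], x["col3"]), axis=1)

print(type(result))  # a pandas Series that I never use
print(values)        # the list I actually care about
```

Here result has one entry per row of df, so on a large frame it duplicates everything that already lives in values.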
How can I apply the function without occupying additional memory?
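One direction I have considered is a plain loop over the two columns, which seems to avoid building the Series at all. This is only a sketch, assuming the same two sample rows and that discarding the return value of function is acceptable:

```python
import pandas as pd

values = []  # global list filled as a side effect


def function(x, y):
    z = 2 * x * y
    values.append(z)
    return z


# Assumption: only the two concrete rows of the sample table are used.
df = pd.DataFrame({"col1": [2, 10], "col2": [3, 12], "col3": [5, 14]})

# zip walks the two columns in lockstep; the return value of function
# is discarded, so no per-row result container is created.
for x, y in zip(df["col2"], df["col3"]):
    function(x, y)

print(values)
```

I am not sure whether this is the idiomatic way, or whether something like itertuples would be preferable when more columns are involved.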