The main reason this doesn't work is that `assign` doesn't modify the existing dataframe in place; it returns a new dataframe object instead.
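To make this concrete, here is a minimal sketch using a small made-up dataframe (the sample values are an assumption; only the column names `a` and `b` come from the question). It shows that the dataframe `assign` is called on keeps its original columns, and that the result is a different object:

```python
import pandas as pd

# Hypothetical sample data; only the column names a and b follow the question.
df = pd.DataFrame({"a": [1, 2], "b": [3, 4]})

# assign returns a new dataframe that has the extra column c.
result = df.assign(c=lambda x: x.a + x.b)

original_columns = list(df.columns)    # df itself has no column c
result_columns = list(result.columns)  # the returned dataframe does
is_same_object = result is df          # the two are distinct objects

print(original_columns, result_columns, is_same_object)
```

So the returned value has to be stored somewhere, otherwise the new columns are lost.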
What you want is to apply the same function to several objects, which is exactly what the `map` function is made for:
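def assign(df):
    # Return a new dataframe with the extra columns c and d
    return df.assign(c=lambda x: x.a + x.b,
                     d=lambda x: x.a ^ x.b)

# Rebind a and b to the new dataframes returned by assign
(a, b) = map(assign, (a, b))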
A more general solution is the following:
# Imagine we don't have control over the following line of code:
dataframes = (a, b)
# We can still use the same solution:
dataframes = tuple(map(assign, dataframes))
print(dataframes[0])
Concerning your edit, the reason why this doesn't work is a bit more interesting. It may not seem obvious in your code, but it will be in this one:
a = [1, 2, 3]
data = a
data = [4, 5, 6]
print(data)
Here it is clear that this outputs `[4, 5, 6]` and not `[1, 2, 3]`.
What happens in both your code and this last example is the same:
- `data = a`: `data` is bound to the same object as `a` (resp. `b`).
- `data = ...`: creates a new binding for `data`, leaving the existing binding of `a` untouched (`data` was only bound to the same object as `a`; `data` never was `a`).
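The two steps above can be checked with the `is` operator, which tests whether two names refer to the same object. This is only an illustrative sketch; the names `same_before` and `same_after` are made up for the example:

```python
a = [1, 2, 3]

data = a                 # data and a are now two names for one list object
same_before = data is a

data = [4, 5, 6]         # rebinds data to a brand-new list; a is not involved
same_after = data is a

print(same_before, same_after, a)
```

After the second assignment the two names no longer share an object, and `a` still holds its original list.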
In the end, `for data in [a, b]:` doesn't mean that `data` will be an alias for `a` (resp. `b`) during each iteration, which is what you may expect when writing this. Instead, `for data in [a, b]:` is simply equivalent to:
data = a  # 1st iteration
data = b  # 2nd iteration
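As a rough illustration of the consequences, the sketch below uses plain lists instead of dataframes (an assumption made to keep it self-contained). It contrasts rebinding the loop variable, mutating the object it refers to, and explicitly rebinding `a` and `b` to new objects, which is what the `map` solution does:

```python
a = [1, 2, 3]
b = [4, 5, 6]

# Rebinding the loop variable: a and b are left untouched.
for data in [a, b]:
    data = data + [0]    # builds a new list and rebinds data only

after_rebinding = (list(a), list(b))

# Mutating the object the loop variable refers to: visible through a and b.
for data in [a, b]:
    data.append(0)       # modifies the shared list object in place

after_mutation = (list(a), list(b))

# Collecting new objects and rebinding a and b explicitly.
a, b = [data + [9] for data in (a, b)]

print(after_rebinding, after_mutation, a, b)
```

Since `assign` produces a new object rather than mutating the existing one, only the last pattern applies to it.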