I am trying to use pandas DataFrame.combine to combine multiple data frames. However, I couldn't figure out what to pass for the func parameter, and the documentation is not very clear to me. It specifies:
DataFrame.combine(other, func, fill_value=None, overwrite=True)
other : DataFrame
func : function. Function that takes two series as inputs and return a Series or a scalar
fill_value : scalar value
overwrite : boolean, default True. If True then overwrite values for common keys in the calling frame
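To make the func contract quoted above concrete, here is a minimal sketch with two small in-memory frames. The frame contents and the helper name take_smaller are illustrative assumptions, not part of my real data. As far as I can tell, combine calls func once per column, passing the two aligned Series.

```python
import pandas as pd

# Two small illustrative frames with the same columns and index.
df1 = pd.DataFrame({"A": [0, 0], "B": [4, 4]})
df2 = pd.DataFrame({"A": [1, 1], "B": [3, 3]})

def take_smaller(s1, s2):
    # Hypothetical helper: called once per column with the two aligned
    # Series; keeps whichever column has the smaller sum.
    return s1 if s1.sum() < s2.sum() else s2

result = df1.combine(df2, take_smaller)
print(result)
```

Here column A is taken from df1 and column B from df2, because those are the columns with the smaller sums.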
After some research, I found that a similar method, DataFrame.combine_first, can be used with reduce as below to combine multiple data frames:
reduce(lambda left,right: pd.DataFrame.combine_first(left,right), [pd.read_csv(f) for f in files])
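For reference, a self-contained version of that reduce pattern, using small in-memory frames in place of the CSV files (the frames and the column name x are assumptions made up for illustration). Each step fills the missing values of the accumulated frame with values from the next one.

```python
from functools import reduce

import pandas as pd

# Illustrative frames standing in for the CSV files; earlier frames
# take priority, later frames only fill the gaps.
frames = [
    pd.DataFrame({"x": [1.0, None, None]}),
    pd.DataFrame({"x": [10.0, 20.0, None]}),
    pd.DataFrame({"x": [100.0, 200.0, 300.0]}),
]

# Fold the list pairwise: combine_first keeps non-null values from
# `left` and fills its nulls from `right`.
merged = reduce(lambda left, right: left.combine_first(right), frames)
print(merged)
```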
How can I use DataFrame.combine to combine multiple data frames?