At the moment I have 9 functions, each of which performs a specific calculation on a data frame, for example average balance per month, rolling P&L, period start balances, and ratio calculations.
Each of those functions produces a Spark data frame with the same layout: the first columns are the group-by columns that the function accepts (1 column if there is 1 group-by variable, 2 columns if there are 2, etc.), and the final column holds the specific calculation, examples of which I listed at the beginning.
Because each of those functions does a different calculation, I need to produce a data frame for each one and then join them to produce a report.
I join them on the group-by variables because these are common to all of them (each individual statistic report).
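To make the setup described above concrete, here is a minimal sketch of the join step. The names `build_report`, `stat_funcs`, and `group_cols` are hypothetical, and the inner join is an assumption; the only requirement is that each function takes the input data frame plus the group-by columns and returns those columns followed by one statistic column.

```python
from functools import reduce
from typing import Callable, List, Sequence


def build_report(df, group_cols: Sequence[str], stat_funcs: List[Callable]):
    """Join the one-statistic data frames on the shared group-by columns.

    Hypothetical helper: `stat_funcs` stands in for the 9 calculation
    functions, each called as func(df, group_cols).
    """
    keys = list(group_cols)

    # Each call yields a data frame shaped like: group_cols + one statistic column.
    stat_frames = [func(df, keys) for func in stat_funcs]

    # Chain the joins pairwise: N statistic frames require N - 1 joins.
    # The "inner" join type is an assumption, not something stated above.
    return reduce(
        lambda left, right: left.join(right, on=keys, how="inner"),
        stat_frames,
    )
```

With 9 functions this chains 8 joins, all on the same group-by columns, which is the part that is slow.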
But doing 7-8 or even more joins is very slow.
Is there a way to add those columns together without using a join?
Thank you.