What is the most efficient way to merge multiple data frames (i.e., more than two) in pandas? There are a few existing answers:
- pandas joining multiple dataframes on columns
- Pandas left outer join multiple dataframes on multiple columns
but these all involve multiple joins. If I have N data frames, these would require N-1 joins.
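For concreteness, the chained approach I mean looks roughly like the sketch below. The frames and column names are made up for illustration, and I'm assuming all frames share the same index:

```python
from functools import reduce

import pandas as pd

# Hypothetical example frames that share an index of keys.
frames = [
    pd.DataFrame({"a": [1, 2, 3]}, index=["x", "y", "z"]),
    pd.DataFrame({"b": [4, 5, 6]}, index=["x", "y", "z"]),
    pd.DataFrame({"c": [7, 8, 9]}, index=["x", "y", "z"]),
]

# Each step joins the running result with the next frame on the index,
# so N frames cost N-1 joins and N-1 intermediate data frames.
chained = reduce(lambda left, right: left.join(right, how="inner"), frames)
```

With three frames this performs two joins, and the result has the columns `a`, `b`, and `c`.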
If I weren't using pandas, another solution would be to put everything into a hash table keyed on the common index and build the final result from that. I believe this is essentially what a hash join does in SQL. Is there something like that in pandas?
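To show what I mean by the hash table idea, here is a minimal sketch in plain Python with pandas used only for input and output. The frames are hypothetical, and it assumes index labels are unique within each frame and column names do not collide:

```python
import pandas as pd

# Hypothetical example frames that share an index of keys.
frames = [
    pd.DataFrame({"a": [1, 2, 3]}, index=["x", "y", "z"]),
    pd.DataFrame({"b": [4, 5, 6]}, index=["x", "y", "z"]),
    pd.DataFrame({"c": [7, 8, 9]}, index=["x", "y", "z"]),
]

# One pass over every frame: the index label is the hash key, and each
# frame adds its column values to the row stored under that key.
rows = {}
for frame in frames:
    for key, record in frame.to_dict(orient="index").items():
        rows.setdefault(key, {}).update(record)

# Build the final data frame once, from the accumulated table.
hashed = pd.DataFrame.from_dict(rows, orient="index")
```

Note that this behaves like an outer join: a key missing from some frame would simply end up with missing values in that frame's columns.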
If not, would it be more efficient to just create a new data frame with the common index and pass it the raw data from each data frame? It seems like that would at least avoid creating a new data frame in each of the N-1 joins.
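Something like the following is what I have in mind for that second option. This is only a sketch under strong assumptions: every frame has exactly the same index in the same order, and no column name appears in more than one frame. The example frames are made up:

```python
import pandas as pd

# Hypothetical example frames with identical indexes.
frames = [
    pd.DataFrame({"a": [1, 2, 3]}, index=["x", "y", "z"]),
    pd.DataFrame({"b": [4, 5, 6]}, index=["x", "y", "z"]),
    pd.DataFrame({"c": [7, 8, 9]}, index=["x", "y", "z"]),
]

# Check the assumption that all indexes match before relying on position.
common_index = frames[0].index
if not all(frame.index.equals(common_index) for frame in frames):
    raise ValueError("frames do not share an identical index")

# Collect each column's underlying array, then construct a single frame.
raw = {col: frame[col].to_numpy() for frame in frames for col in frame.columns}
combined = pd.DataFrame(raw, index=common_index)
```

Only one new data frame is constructed here, but the index check is doing the work a join would otherwise do, so it would not handle frames whose keys differ or are ordered differently.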
Thanks.