
I have 32 separate lists of dataframes and need to merge the dataframes within each list, so I expect 32 merged dataframes. I know how to merge one list of dataframes, but I am currently repeating that calculation 32 times. Is there a simple way to apply the same merge to every list? My current attempt is below. I know it creates new objects, but I don't know how to assign each result back to the original list it came from. `weather_list` is a list containing every list of dataframes that needs to be merged.

Sample lists: `data` and `data_day1` contain dataframes named `snow`, `temp`, etc., and `weather_list` contains the `data` and `data_day1` lists.

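    from functools import reduce
    import pandas as pd

    data = [snow, temp, windspd]
    data_day1 = [snow_day1, temp_day1, windspd_day1]
    weather_list = [data, data_day1]

    def mergedf(item):
        # Outer-merge every dataframe in one list on the shared coordinate columns
        return reduce(lambda left, right: pd.merge(left, right, on=['Latitude', 'Longitude'], how='outer'), item)

    [mergedf(items) for items in weather_list]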

I need to use each merged dataframe separately later in my program.

Jesse

1 Answer

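Consider `map` for an elementwise loop over your lists, which you can unpack with `*`. Unpacking passes the dataframes at the same position in each list to the function together, so `snow` is merged with `snow_day1`, `temp` with `temp_day1`, and so on.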

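import pandas as pd

data = [snow, temp, windspd]
data_day1 = [snow_day1, temp_day1, windspd_day1]
weather_list = [data, data_day1]

def proc_merge(left, right):
    # Merge two dataframes on the shared coordinate columns
    return pd.merge(left, right, on=['Latitude', 'Longitude'])

# One merged dataframe per position: (snow, snow_day1), (temp, temp_day1), ...
df_list = list(map(proc_merge, *weather_list))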
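
To see what the unpacked `map` call pairs up, here is a minimal sketch with made-up two-row frames. The values, the column names (`Snow`, `Snow_day1`, and so on), and the `make_df` helper are assumptions for illustration only, not the asker's data.

```python
import pandas as pd

def make_df(value_col, values):
    # Hypothetical helper: a tiny frame keyed by the two coordinate columns
    return pd.DataFrame({'Latitude': [40.0, 41.0],
                         'Longitude': [-105.0, -104.0],
                         value_col: values})

snow, temp = make_df('Snow', [1, 2]), make_df('Temp', [30, 28])
snow_day1, temp_day1 = make_df('Snow_day1', [3, 4]), make_df('Temp_day1', [31, 27])

weather_list = [[snow, temp], [snow_day1, temp_day1]]

def proc_merge(left, right):
    return pd.merge(left, right, on=['Latitude', 'Longitude'])

# Calls proc_merge(snow, snow_day1), then proc_merge(temp, temp_day1)
df_list = list(map(proc_merge, *weather_list))

print(list(df_list[0].columns))  # ['Latitude', 'Longitude', 'Snow', 'Snow_day1']
print(list(df_list[1].columns))  # ['Latitude', 'Longitude', 'Temp', 'Temp_day1']
```

Note that the result holds one merged frame per variable (snow, temp), not one per list.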

To retain the names, consider building a dictionary of dataframes with named keys instead of an unnamed list. You can build it by zipping the names with the merged results:

names = ['snow', 'temp', 'windspd']
df_dict = {nm: lst for nm, lst in zip(names, map(proc_merge, *weather_list))}

df_dict['snow']      # SINGLE DATAFRAME
df_dict['temp']      # SINGLE DATAFRAME
df_dict['windspd']   # SINGLE DATAFRAME

When each merge involves more than two dataframes (that is, more than two lists), combine `reduce` with a function that accepts any number of arguments:

from functools import reduce
import pandas as pd

data = [snow, temp, windspd]
data_day1 = [snow_day1, temp_day1, windspd_day1]
data_day2 = [snow_day2, temp_day2, windspd_day2]
data_day3 = [snow_day3, temp_day3, windspd_day3]

weather_list = [data, data_day1, data_day2, data_day3]

def proc_merge(*dfs):
    # Outer-merge all dataframes passed in, pairwise from left to right
    return reduce(lambda left, right: pd.merge(left, right, on=['Latitude', 'Longitude'], how='outer'), dfs)

names = ['snow', 'temp', 'windspd']
df_dict = {nm: lst for nm, lst in zip(names, map(proc_merge, *weather_list))}
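
If what is needed is instead one merged dataframe per list (all variables for one day in a single frame), a dictionary keyed by list name is one way to avoid both the repeated calls and `globals()`. The sketch below is only an illustration: the tiny frames, the keys `'data'` and `'data_day1'`, and the helper names `make_df` and `merge_all` are assumptions, not part of pandas or the original data.

```python
from functools import reduce
import pandas as pd

def make_df(value_col, values):
    # Hypothetical helper: a tiny frame keyed by the two coordinate columns
    return pd.DataFrame({'Latitude': [40.0, 41.0],
                         'Longitude': [-105.0, -104.0],
                         value_col: values})

# Assumed stand-ins for the real lists; the keys play the role of the list names
weather = {
    'data': [make_df('Snow', [1, 2]), make_df('Temp', [30, 28]), make_df('WindSpd', [5, 7])],
    'data_day1': [make_df('Snow', [3, 4]), make_df('Temp', [31, 27]), make_df('WindSpd', [6, 8])],
}

def merge_all(dfs):
    # Outer-merge every frame in one list on the shared coordinate columns
    return reduce(lambda left, right: pd.merge(left, right, on=['Latitude', 'Longitude'], how='outer'), dfs)

# One merged dataframe per list, looked up by the list's name
merged = {name: merge_all(dfs) for name, dfs in weather.items()}

print(merged['data'].shape)  # (2, 5)
```

Each merged frame is then available as `merged['data']`, `merged['data_day1']`, and so on, which scales to 32 lists without creating 32 separate variables.
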
Parfait
  • Thanks, I got a new list of merged dataframes, which is what I'm looking for. Is there any way to do the merge in place so that each merged dataframe has the same name as its list? I need to use the dataframes under those names throughout the rest of my program. – Jesse Jul 09 '21 at 18:37
  • Hmmm... a Python list is unnamed, unlike a dictionary with keys. Can you edit the post with sample data to clarify what you mean? – Parfait Jul 09 '21 at 19:50
  • I have edited the post. Basically what I am after is data = merge of the 3 dataframes in the list. – Jesse Jul 09 '21 at 20:04
  • See revamped answer (now that I see actual data objects). Consider using dictionaries instead of lists for name keys. Otherwise, you would need to use the ill-advised `globals()` dictionary to save object names by string references. – Parfait Jul 09 '21 at 21:00