I'm trying to merge several panel dataframes on 'idcode' and sort each of them by 'idcode'.
My dataframes are named wave68, wave69, ..., wave71.
Overall, I have two problems:
First, I want to sort the dataframes in a loop, but I don't know how to build the dataframe names inside the loop, i.e. wave+i.
Second, I don't know how to write the loop so that it merges the dataframes correctly.
The final result I want is wide-form panel data, sorted by 'idcode', built from wave68, wave69, wave70, ... using a loop.
import pandas as pd
import numpy as np
wave68 = pd.read_csv('panel_data/wave68.csv')
wave69 = pd.read_csv('panel_data/wave69.csv')
wave70 = pd.read_csv('panel_data/wave70.csv')
wave71 = pd.read_csv('panel_data/wave71.csv')
df = [wave68, wave69, wave70, wave71]
# My attempt at a sorting function (this does not work)
def my_sorter(file_name, var):
    for i in file_name:
        file_name[i].sort_values(by=[var])
wave68 = wave68.sort_values(by=['idcode'])
wave69 = wave69.sort_values(by=['idcode'])
wave70 = wave70.sort_values(by=['idcode'])
merged = pd.merge(wave68, wave69, on='idcode')
merged = pd.merge(merged, wave70, on='idcode')
merged = pd.merge(merged, wave71, on='idcode')
merged.head(20)
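
For illustration, here is a minimal sketch of how both steps could be done in a loop. It is only one possible approach, not a definitive solution. It assumes the waves are kept in a dict keyed by name instead of in separate variables, that every wave has an 'idcode' column, and that the non-key columns should get the wave name as a suffix so they do not clash in the wide result. The helper names `sort_waves` and `merge_waves`, the `income` column, and the tiny in-memory frames are all made up for the example; the inner join matches the default of `pd.merge` used above.

```python
from functools import reduce

import pandas as pd


def sort_waves(waves, key="idcode"):
    """Return a new dict in which every DataFrame is sorted by `key`."""
    # sort_values returns a new DataFrame, so the result has to be stored
    return {
        name: df.sort_values(by=key).reset_index(drop=True)
        for name, df in waves.items()
    }


def merge_waves(waves, key="idcode", how="inner"):
    """Merge all waves into one wide DataFrame with one row per `key`."""
    # Append the wave name to every non-key column to avoid name clashes
    renamed = [
        df.rename(columns={c: f"{c}_{name}" for c in df.columns if c != key})
        for name, df in waves.items()
    ]
    # Fold the list pairwise: ((w68 + w69) + w70) + ...
    merged = reduce(
        lambda left, right: pd.merge(left, right, on=key, how=how), renamed
    )
    return merged.sort_values(by=key).reset_index(drop=True)


# Made-up frames standing in for the CSV files. With the real data the dict
# could be built in a loop instead, for example:
# waves = {f"wave{i}": pd.read_csv(f"panel_data/wave{i}.csv") for i in range(68, 72)}
waves = {
    "wave68": pd.DataFrame({"idcode": [3, 1, 2], "income": [30, 10, 20]}),
    "wave69": pd.DataFrame({"idcode": [2, 3, 1], "income": [21, 31, 11]}),
    "wave70": pd.DataFrame({"idcode": [1, 2, 3], "income": [12, 22, 32]}),
}

sorted_waves = sort_waves(waves)
wide = merge_waves(sorted_waves)
print(wide)
```

Keeping the frames in a dict avoids having to construct variable names such as wave+i at all: the f-string builds the file name and the dict key, and the loop simply iterates over the dict. Passing `how="outer"` would keep every idcode that appears in any wave, which may or may not be what is wanted for the panel.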