In Python, I am working with longitudinal school data, and have 6 subsets of data, each with the same 4 years of school data (4 files) and the same students for the most part. Each subset represents something different such as standardized test scores, attendance data, etc.
What I want to do is merge them into 1 big file where each student's rows are stacked by year and include the columns from all of the subsets. For example, if a student's ID number is 123456, I would want the big data set to look like:
Student ID    Year    Test Score    Days Absent    ...
123456        2016    97            10
123456        2017    91            14
123456        2018    94            16
Let's say one of the subsets is called "test scores", and it contains 4 files titled 2016, 2017, 2018, and 2019. How would I merge those 4 files together so that the rows are stacked by student ID number and school year, as in the example above?
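For the stacking step, something like the following might be a starting point. It is only a sketch that assumes pandas is available; the column names (`student_id`, `test_score`, `year`) and the small in-memory frames are hypothetical stand-ins for the real yearly files, which would more likely be loaded with `pd.read_csv`.

```python
import pandas as pd

# Hypothetical stand-ins for the four yearly "test scores" files.
# With real data, each frame would be read from disk instead.
yearly_frames = {
    2016: pd.DataFrame({"student_id": [123456, 234567], "test_score": [97, 88]}),
    2017: pd.DataFrame({"student_id": [123456, 234567], "test_score": [91, 85]}),
    2018: pd.DataFrame({"student_id": [123456, 234567], "test_score": [94, 90]}),
    2019: pd.DataFrame({"student_id": [123456, 234567], "test_score": [96, 87]}),
}

# Tag every row with the year of the file it came from, then stack the frames.
stacked = pd.concat(
    [df.assign(year=year) for year, df in yearly_frames.items()],
    ignore_index=True,
)

# Sort so that each student's rows sit together, ordered by year.
test_scores = (
    stacked.sort_values(["student_id", "year"])
    .reset_index(drop=True)[["student_id", "year", "test_score"]]
)
print(test_scores)
```

The year has to be added as a column before stacking, because in this setup it lives only in the file name and would otherwise be lost.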
After I merge the files of that subset, let's say there's another subset called "achievement", which is a measure of teachers' evaluations of students. One of its variables is the same student ID, and another is called grade level. How would I then merge the grade level column into the merged test scores file based on student ID number, so that each student in the merged test scores dataset has a grade level associated with them?
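For the second step, a left merge might do it. This is a sketch under assumptions: pandas is available, the "achievement" subset has already been stacked by year in the same way, and the column names (`student_id`, `year`, `grade_level`, `teacher_rating`) are hypothetical. It joins on both student ID and year, since grade level presumably changes from year to year; joining on student ID alone would repeat each test score row once per achievement row for that student.

```python
import pandas as pd

# Hypothetical stacked "test scores" data (one row per student per year).
test_scores = pd.DataFrame({
    "student_id": [123456, 123456, 234567],
    "year": [2016, 2017, 2016],
    "test_score": [97, 91, 88],
})

# Hypothetical stacked "achievement" data with the same keys.
achievement = pd.DataFrame({
    "student_id": [123456, 123456, 234567],
    "year": [2016, 2017, 2016],
    "grade_level": [3, 4, 5],
    "teacher_rating": [4, 5, 3],
})

# Keep only the keys plus the column to bring over, then left-merge so that
# every test score row survives even if it has no matching achievement row.
merged = test_scores.merge(
    achievement[["student_id", "year", "grade_level"]],
    on=["student_id", "year"],
    how="left",
    validate="one_to_one",  # raises if a student/year pair is duplicated
)
print(merged)
```

Students who appear in the test scores but not in the achievement data would end up with a missing grade level rather than being dropped, which seems safer when the subsets share the same students only "for the most part".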
Thanks!