I am currently working on a large data set of about 9,000 observations belonging to different groups, and I would like to analyze it using a split-sample design. Let me explain in detail what I would like to do. My data has the following structure:
GroupID  Performance  Commitment  Affect  Size
1234     5            4           2       2
1234     6            8           9       2
2235     4            3           2       5
2235     4            3           2       5
2235     2            1           7       5
2235     2            1           7       5
2235     2            6           10      5
3678     3            5           5       4
3678     7            3           5       4
3678     5            2           6       4
3678     1            4           6       4
Now, I would like to aggregate this data as follows: for each group, the average performance score of the first half of the group should be combined with the average commitment and affect scores of the second half of the group to create one new observation. For groups of uneven size, I would like to drop one observation within the group (say, the last one) so that the group can be split evenly. However, I would like to do this in two steps. First, row i of a group's first half should be paired with row i of its second half, so the data would look like this:
GroupID  Performance  Commitment  Affect  Size
1234     5            8           9       2
2235     4            1           7       5
2235     4            1           7       5
3678     3            2           6       4
3678     7            4           6       4
In the next step, I would like to aggregate the data. The new data set would have one observation per group and look like this:
GroupID  Performance  Commitment  Affect  Size
1234     5            8           9       2
2235     4            1           7       5
3678     5            3           6       4
Again, please note that the last observation of group 2235 was dropped, since the group's size was odd.
Is there a package out there that would split and aggregate my data in this way? If not, how would you go about coding this? I would be very grateful for any advice, since I currently have no idea how to approach this elegantly, other than writing a bunch of for loops.
Here is the code for the above example:
groupid <- c(1234, 1234, 2235, 2235, 2235, 2235, 2235, 3678, 3678, 3678, 3678)
performance <- c(5, 6, 4, 4, 2, 2, 2, 3, 7, 5, 1)
commitment <- c(4, 8, 3, 3, 1, 1, 6, 5, 3, 2, 4)
affect <- c(2, 9, 2, 2, 7, 7, 10, 5, 5, 6, 6)
size <- c(2, 2, 5, 5, 5, 5, 5, 4, 4, 4, 4)
mydata <- data.frame(groupid, performance, commitment, affect, size)
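To make the desired behavior concrete, here is a rough base-R sketch that seems to reproduce the two tables above (the helper name split_half is just something I made up; I suspect there is a more elegant or package-based way, which is what I am asking about):

```r
# mydata as constructed above
mydata <- data.frame(
  groupid     = c(1234, 1234, 2235, 2235, 2235, 2235, 2235, 3678, 3678, 3678, 3678),
  performance = c(5, 6, 4, 4, 2, 2, 2, 3, 7, 5, 1),
  commitment  = c(4, 8, 3, 3, 1, 1, 6, 5, 3, 2, 4),
  affect      = c(2, 9, 2, 2, 7, 7, 10, 5, 5, 6, 6),
  size        = c(2, 2, 5, 5, 5, 5, 5, 4, 4, 4, 4)
)

# For one group: drop the last row if the group size is odd, then pair
# row i of the first half with row i of the second half.
split_half <- function(d) {
  h <- nrow(d) %/% 2            # half of the (even) group size
  d <- d[seq_len(2 * h), ]      # drops the last row of odd-sized groups
  data.frame(
    groupid     = d$groupid[seq_len(h)],
    performance = d$performance[seq_len(h)],      # first half
    commitment  = d$commitment[h + seq_len(h)],   # second half
    affect      = d$affect[h + seq_len(h)],       # second half
    size        = d$size[seq_len(h)]
  )
}

# Step 1: one row per first-/second-half pair within each group
step1 <- do.call(rbind, lapply(split(mydata, mydata$groupid), split_half))

# Step 2: average within each group, giving one observation per group
step2 <- aggregate(cbind(performance, commitment, affect, size) ~ groupid,
                   data = step1, FUN = mean)
```

split() breaks mydata into one data frame per group, so no explicit for loop is needed; aggregate() then collapses the paired rows to one row per group.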
Many thanks!!