This takes a bit to explain, so apologies in advance if the post is on the long side.
I have MANY data frames of individual chess players and their specific ratings at points in time.
Here is what my data looks like. Please forgive me for my poor formatting of separating the datasets. Carlsen and Nakamura are separate dataframes.
Player1
Nakamura, Hikaru Year
2364 2001-01-01
2430 2002-01-01
2520 2003-01-01
2571 2004-01-01
2613 2005-01-01
2644 2006-01-01
2651 2007-01-01
2670 2008-01-01
2699 2009-01-01
2708 2010-01-01
2751 2011-01-01
2759 2012-01-01
2769 2013-01-01
2789 2014-01-01
2776 2015-01-01
2787 2016-01-01
Player2
Carlsen, Magnus Year
2127 2002-01-01
2279 2003-01-01
2484 2004-01-01
2553 2005-01-01
2625 2006-01-01
2690 2007-01-01
2733 2008-01-01
2776 2009-01-01
2810 2010-01-01
2814 2011-01-01
2835 2012-01-01
2861 2013-01-01
2872 2014-01-01
2862 2015-01-01
2844 2016-01-01
You can download the two sets here:
Download Player2 Download Player1
Between the code above and the code below, I've deleted two columns and reassigned an observation as a column title.
Hikaru Nakamura/Magnus Carlsen's chess rating over time
Hikaru's data is assigned to a dataframe, Player1. Magnus's data is assigned to a dataframe, Player2.
What I want to be able to do is get what you see below, a dataframe of them combined.
The code I used to produce this frame is
merged<- merge(Player1, Player2, by = c("Year"), all = TRUE)
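For anyone who wants to reproduce the two-player case, here is a minimal sketch using only the first three rows of each table above. It assumes the Year column is stored as a Date and that the rating column is named after the player; the real frames may differ.

```r
# First three rows of each player's table from above.
# check.names = FALSE keeps the "Surname, Given" column names intact.
Player1 <- data.frame(
  `Nakamura, Hikaru` = c(2364, 2430, 2520),
  Year = as.Date(c("2001-01-01", "2002-01-01", "2003-01-01")),
  check.names = FALSE
)
Player2 <- data.frame(
  `Carlsen, Magnus` = c(2127, 2279, 2484),
  Year = as.Date(c("2002-01-01", "2003-01-01", "2004-01-01")),
  check.names = FALSE
)

# all = TRUE keeps years that appear in only one frame (a full outer join);
# the missing rating in such a row is filled with NA.
merged <- merge(Player1, Player2, by = "Year", all = TRUE)
print(merged)
```

With all = TRUE, a year present for only one player is kept and the other player's rating is NA in that row, which is why frames of unequal length are not a problem for merge().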
Now, this is all fine and dandy for two data sets, but I am having a lot of difficulty adding more players to this combined data set.
For example, maybe I would like to add 5, 10, or 15 more players to this set. Examples of these players would be Kramnik, Anand, and Gelfand (all famous chess players). As you'd expect, for 5 players the dataframe would have 6 columns, for 10 it would have 11, and for 15 it would have 16, all ordered nicely by the Year variable.
Fortunately, the number of observations for each player is always less than 100. Also, each individual player is assigned his/her own dataset.
For example,
Nakamura is the Player1 dataframe
Carlsen is the Player2 dataframe
Kramnik is the Player3 dataframe
Anand is the Player4 dataframe
Gelfand is the Player5 dataframe
all of which I created in a for loop with assign(), using this code:
player_names <- unique(Timed_set_filtered$Name)
for (i in seq_along(player_names)) {
  # Create Player1, Player2, ... holding the rows for one player each
  assign(paste0("Player", i), subset(Timed_set_filtered, Name == player_names[i]))
}
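As a point of comparison with the assign() loop, base R's split() produces one data frame per player in a single named list, with no numbered global objects. This is only a sketch: the toy Timed_set_filtered below uses a few rows from the tables above, and its three-column layout (Name, Rating, Year) is an assumption, not the real structure.

```r
# Toy stand-in for Timed_set_filtered; the column layout is assumed.
Timed_set_filtered <- data.frame(
  Name   = c("Nakamura, Hikaru", "Nakamura, Hikaru",
             "Carlsen, Magnus", "Carlsen, Magnus"),
  Rating = c(2430, 2520, 2279, 2484),
  Year   = as.Date(c("2002-01-01", "2003-01-01",
                     "2003-01-01", "2004-01-01"))
)

# split() returns a list with one data frame per distinct Name,
# and each list element is named after the player.
players <- split(Timed_set_filtered, Timed_set_filtered$Name)

print(names(players))
print(players[["Carlsen, Magnus"]])
```

Because the list elements are keyed by player name, there is no need to keep a separate lookup from Player"i" back to the name.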
I don't want to write out something like below:
merged<- merge(Player1, Player2,.....Player99 ,Player100, by = c("Year"), all = TRUE)
I want to be able to merge all 5, 10, 15...i of the Player"i" objects that I created in the loop together by Year.
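One way this kind of repeated merge is often written in base R is with Reduce(), which applies a two-argument function cumulatively across a list. The sketch below assumes each data frame already has a Year column plus one rating column named after the player. The helper name merge_by_year is hypothetical, and the third player and its ratings are made up purely to show more than two frames.

```r
players <- list(
  data.frame(Year = as.Date(c("2001-01-01", "2002-01-01", "2003-01-01")),
             `Nakamura, Hikaru` = c(2364, 2430, 2520), check.names = FALSE),
  data.frame(Year = as.Date(c("2002-01-01", "2003-01-01", "2004-01-01")),
             `Carlsen, Magnus` = c(2127, 2279, 2484), check.names = FALSE),
  # Placeholder player with made-up ratings (not real data)
  data.frame(Year = as.Date(c("2003-01-01", "2004-01-01")),
             `Placeholder, Player` = c(2500, 2510), check.names = FALSE)
)

# Hypothetical helper: a full outer join of two frames on Year.
merge_by_year <- function(x, y) merge(x, y, by = "Year", all = TRUE)

# Reduce() merges frame 1 with frame 2, then that result with frame 3, and so on.
merged_all <- Reduce(merge_by_year, players)
print(merged_all)
```

The same call works for any list length, so the number of players never has to be written out.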
Also, once it leaves the loop initially, each dataset looks like this.
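So what I end up doing is collecting all of the data sets into a list and renaming the Rating column, using the following snippet:
# Collect every Player<i> data frame from the global environment into a named list
lst <- mget(ls(pattern = '^Player\\d+'))
# Drop the second column from each data frame and write the results back
list2env(lapply(lst, `[`, -2), envir = .GlobalEnv)
lst <- mget(ls(pattern = '^Player\\d+'))
# Rename "Rating" to the player's name. Index by name rather than position,
# because ls() sorts alphabetically (Player10 comes before Player2)
player_names <- as.character(unique(Timed_set_filtered$Name))
for (i in seq_along(player_names)) {
  key <- paste0("Player", i)
  names(lst[[key]])[names(lst[[key]]) == 'Rating'] <- player_names[i]
}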
This is what my list looks like.
Is there a way to build a table merged on Year that combines (via cbind, bind_cols, merge, etc.) each of the Player"i" dataframes in my list, which are not necessarily equal in length, so that I get a combined set like the one shown below the merge(Player1, Player2) call?
Here is the diagram again, but it would have to be for many players, not just Carlsen and Nakamura.
Also, is there a way I can avoid using the list, and just directly do
names(Player"i")[names(Player"i") == 'Rating'] <- unique(Timed_set_filtered$Name)[i]
which would rename the Rating column in every dataframe whose name starts with "Player", and then
merge(Player1, Player2, Player3,...., Player99, Player100, by = c("Year"), all = TRUE)
which would merge all of the Player"i" datasets?
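For the renaming half of this, a sketch of what "without the list" could look like uses get() and assign() to fetch each Player"i" object by name and write it back. The two small frames and the player_names vector below are stand-ins built from the first rows of the tables above, with an assumed Rating/Year layout. Note that merge() itself only accepts two data frames (x and y), so the merging half still needs some form of iteration.

```r
# Stand-ins for the loop-created objects; the column layout is assumed.
Player1 <- data.frame(Rating = c(2364, 2430),
                      Year = as.Date(c("2001-01-01", "2002-01-01")))
Player2 <- data.frame(Rating = c(2127, 2279),
                      Year = as.Date(c("2002-01-01", "2003-01-01")))
player_names <- c("Nakamura, Hikaru", "Carlsen, Magnus")

for (i in seq_along(player_names)) {
  key <- paste0("Player", i)   # "Player1", "Player2", ...
  df <- get(key)               # fetch the object by its name
  names(df)[names(df) == "Rating"] <- player_names[i]
  assign(key, df)              # write the renamed copy back under the same name
}

print(names(Player1))
print(names(Player2))
```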
If anything is unclear, please mention it.