You said in the comments that you'd figured this out, but this was a useful practice problem for me, so I thought I would go ahead and post an answer in case it's useful to others, and, hopefully, to find out if there are more efficient ways to do this.
I did two versions: one that assumes the sets of columns are different observations of the same thing,and another that assumes the sets of columns represent different groups that you're observing with the same measures.
Here's the data I used; the last row includes different dates to make sure the process is resilient to that situation:
Date PX_SETTLE CHG_NET_1D Date.1 PX_SETTLE.1 CHG_NET_1D.1
12/7/1988 91.07 0.03 12/7/1988 90.93 -0.02
12/8/1988 91.09 0.02 12/8/1988 NA -0.02
12/9/1988 91.1 0.01 12/9/1988 90.88 NA
12/12/1988 91.1 0 12/12/1988 90.86 -0.02
12/15/1988 91.12 0.02 12/15/1988 90.76 -0.1
12/16/1988 91.2 0.08 12/20/1988 90.96 0.20
Multiple measures of single entity
This version assumes that the sets of columns are different observations of the same thing (e.g., prices of a particular commodity on particular date across multiple exchanges), so in the end you want just one row per date with columns indexed by sets of observations (e.g., sources).
# If necessary, add an index value to first block of columns in df; see http://stackoverflow.com/questions/8898501/grepl-search-within-a-string-that-does-not-contain-a-pattern
names(df) <- c(paste0(names(df)[!grepl("\\.", names(df))], ".0"), names(df[grepl("\\.", names(df))])
# Make vector that will serve as index to columns in df
indx <- gsub(".*\\.", "", names(df))
# create separate dfs in workspace based on that index; see http://stackoverflow.com/questions/27501615/splitting-a-dataframe-by-column-name-indices
list2env(
setNames(
lapply(split(colnames(df), indx), function(x) df[x]),
paste('df', sort(unique(indx)), sep="_")),
envir=.GlobalEnv)
# Create function to scrub each df by simplifying Date name and setting is as key
fixit <- function(dataframe) {
require(data.table)
colnames(dataframe) <- sub("Date.*", "Date", colnames(dataframe))
dataframe$Date <- as.Date(dataframe$Date, format="%m/%d/%Y")
DT <- setkey(data.table(dataframe), Date)
return(DT)
}
# Make a list of data frames in the workspace with names that fit the pattern created a couple of steps above
dflist <- Filter(function(x) is(x, "data.frame"), mget(ls(pattern = "_[0-9]")))
# Apply the scrubbing function to the items in that list, returning scrubbed list
newdflist <- lapply(dflist, fixit)
# Merge the elements of that new list, keeping all values and keying on Date
newdf <- Reduce(function(...) merge(..., all=TRUE), newdflist)
Here's what the result of that process looks like. The suffixes on the non-date variables denote the different sets of observations:
> newdf
Date PX_SETTLE.0 CHG_NET_1D.0 PX_SETTLE.1 CHG_NET_1D.1
1: 1988-12-07 91.07 0.03 90.93 -0.02
2: 1988-12-08 91.09 0.02 NA -0.02
3: 1988-12-09 91.10 0.01 90.88 NA
4: 1988-12-12 91.10 0.00 90.86 -0.02
5: 1988-12-15 91.12 0.02 90.76 -0.10
6: 1988-12-16 91.20 0.08 NA NA
7: 1988-12-20 NA NA 90.96 0.20
Common measures of multiple entities
Now let's assume that each block of columns represents a distinct entity, like a ticker symbol, measured the same way on overlapping sets of dates. Here we'll repeat most of the process above, except that now the scrubbing function won't bother with keying the list elements, because we want to allow them to repeat (across groups). Instead, the function will add an indexing variable and simplify all the variable names across list elements to a common format to facilitate the merging that comes at the end.
names(df) <- c(paste0(names(df)[!grepl("\\.", names(df))], ".0"), names(df[grepl("\\.", names(df))])
indx <- gsub(".*\\.", "", names(df))
list2env(
setNames(
lapply(split(colnames(df), indx), function(x) df[x]),
paste('df', sort(unique(indx)), sep="_")),
envir=.GlobalEnv)
dflist <- Filter(function(x) is(x, "data.frame"), mget(ls(pattern = "_[0-9]")))
fixit2 <- function(dataframe) {
z <- dataframe
z[,"index"] <- unlist(strsplit(colnames(z)[1], "\\."))[2]
colnames(z) <- sub("\\.[0-9]", "", colnames(z))
return(z)
}
newdflist2 <- lapply(dflist, fixit2)
newdf2 <- Reduce(function(...) merge(..., all=TRUE), newdflist2)
newdf2 <- newdf2[order(newdf2$index, newdf2$Date),]
And here's what the output from that looks like:
> newdf2
Date PX_SETTLE CHG_NET_1D index
2 1988-12-07 91.07 0.03 0
3 1988-12-08 91.09 0.02 0
6 1988-12-09 91.10 0.01 0
8 1988-12-12 91.10 0.00 0
10 1988-12-15 91.12 0.02 0
11 1988-12-16 91.20 0.08 0
1 1988-12-07 90.93 -0.02 1
4 1988-12-08 NA -0.02 1
5 1988-12-09 90.88 NA 1
7 1988-12-12 90.86 -0.02 1
9 1988-12-15 90.76 -0.10 1
12 1988-12-20 90.96 0.20 1
Both approaches could be simplified with piping or nesting some of the later steps.