If the data is stored in an object of type matrix(), and subsequent operations against the matrix use row and column references instead of named columns, the original answer works fine.
We'll generate a matrix of data, rename the columns, and display the matrix. set.seed() is called first so that the values drawn by runif() are reproducible.
set.seed(3104)
nameList <- c('Jan.94','Feb.94','Mar.94',
'Jan.94.x','Feb.94.x','Mar.94.x',
'Jan.94.x.x','Feb.94.x.x','Mar.94.x.x')
x <- matrix(runif(90),nrow=10,ncol=9)
colnames(x) <- gsub(".x","",nameList,fixed=TRUE)
head(x)
...and the output:
> head(x)
         Jan.94    Feb.94    Mar.94    Jan.94    Feb.94     Mar.94     Jan.94
[1,] 0.73967666 0.3950552 0.4593954 0.5246329 0.9318526 0.97022213 0.51974938
[2,] 0.78333764 0.8019435 0.3277070 0.8342044 0.9564895 0.31632572 0.02162478
[3,] 0.07161414 0.3681912 0.5151378 0.8647585 0.9841725 0.69784065 0.05600622
[4,] 0.92636930 0.6643402 0.2357173 0.6178838 0.5324841 0.42694750 0.13356315
[5,] 0.26566868 0.7210794 0.6275253 0.9630575 0.5757118 0.63363792 0.30718159
[6,] 0.57439103 0.1076186 0.8501558 0.0615584 0.3375161 0.06738025 0.25910038
         Feb.94     Mar.94
[1,] 0.82225954 0.94697173
[2,] 0.03341796 0.08548795
[3,] 0.99208753 0.37739177
[4,] 0.85306984 0.00283353
[5,] 0.61724901 0.16111121
[6,] 0.21789765 0.07376294
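To make the difference between positional and named references concrete, here is a minimal sketch on a small made-up matrix. The object m and its values are assumptions for illustration only and are not part of the data above.

```r
# Hypothetical 2 x 3 matrix with a duplicated column name
m <- matrix(1:6, nrow = 2, ncol = 3)
colnames(m) <- c("Jan.94", "Feb.94", "Jan.94")

# Positional reference: every column is reachable, duplicates included
third_col <- m[, 3]

# Named reference: only the first column called "Jan.94" is returned
first_jan <- m[, "Jan.94"]

print(third_col)   # values from column 3: 5 6
print(first_jan)   # values from column 1: 1 2
```

As long as the code downstream sticks to positions such as m[, 3], the duplicated names are only labels and do no harm.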
However, if one needs to access the columns in an object of type data.frame() with the $ form of the extract operator, one gets unexpected results when multiple columns have the same column name.
# use with data.frame() introduces subtle defect
# when using the $ form of the extract operator
set.seed(3104)
x <- data.frame(matrix(runif(90),nrow=10,ncol=9))
colnames(x) <- gsub(".x","",nameList,fixed=TRUE)
# extract only retrieves the first column named Jan.94
x$Jan.94
...and the output:
> x$Jan.94
[1] 0.73967666 0.78333764 0.07161414 0.92636930 0.26566868 0.57439103
[7] 0.60409610 0.10018717 0.67436946 0.90823532
>
When a data.frame() has multiple columns with the same name, the $ form of the extract operator returns only the first match, so the remaining columns with that name cannot be reached with $.
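If name-based access is required, one option is to keep the column names unique rather than stripping the suffixes. This is an alternative to the renaming shown above, not part of the original answer. The sketch below uses base R's duplicated() and make.unique() on a made-up name vector; nms and df are hypothetical names used only for illustration.

```r
# Hypothetical column names containing duplicates
nms <- c("Jan.94", "Feb.94", "Jan.94", "Feb.94")

# Flag the repeats: TRUE marks a name that has already appeared
dup_flags <- duplicated(nms)

# Append numeric suffixes so that every name is distinct
unique_nms <- make.unique(nms)

# Small made-up data frame using the de-duplicated names
df <- data.frame(matrix(1:8, nrow = 2))
colnames(df) <- unique_nms

print(dup_flags)
print(unique_nms)
print(df$Jan.94.1)   # the second "Jan.94" column is now reachable with $
```

With distinct names, each column can be addressed with $ again, at the cost of reintroducing a suffix.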
That said, it is possible to extract multiple columns with the same name from a data frame, but it takes a bit more effort.
head(x[,grepl("Jan.94",colnames(x))])
...and the result:
> head(x[,grepl("Jan.94",colnames(x))])
      Jan.94  Jan.94.1   Jan.94.2
1 0.73967666 0.5246329 0.51974938
2 0.78333764 0.8342044 0.02162478
3 0.07161414 0.8647585 0.05600622
4 0.92636930 0.6178838 0.13356315
5 0.26566868 0.9630575 0.30718159
6 0.57439103 0.0615584 0.25910038
>
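Note that the pattern "Jan.94" passed to grepl() is a regular expression, so the . matches any character. An exact comparison on the names avoids that and yields column positions, which can then be used with the [[ form of the extract operator to pull out any one of the same-named columns. A minimal sketch on made-up data (df and idx are hypothetical names for illustration):

```r
# Hypothetical data frame with a duplicated column name
df <- data.frame(matrix(1:6, nrow = 2))
colnames(df) <- c("Jan.94", "Feb.94", "Jan.94")

# Positions of every column whose name is exactly "Jan.94"
idx <- which(colnames(df) == "Jan.94")

# Extract the second of the same-named columns by position
second_jan <- df[[idx[2]]]

print(idx)          # positions 1 and 3
print(second_jan)   # values 5 6
```

Working from positions keeps the duplicated names intact while still reaching every column.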