The apply() function coerces its first argument to a matrix (with as.matrix()) before calling the function over the requested margin, which here is each column. So your data frames are coerced to matrix objects. A consequence of that conversion is that as.matrix(df_AB) has non-null rownames, while as.matrix(df_ab) does not:
> str(as.matrix(df_ab))
int [1:5, 1:2] 1 2 3 4 5 6 5 4 3 2
- attr(*, "dimnames")=List of 2
..$ : NULL
..$ : chr [1:2] "a" "b"
> str(as.matrix(df_AB))
int [1:5, 1:2] 1 2 3 4 5 6 5 4 3 2
- attr(*, "dimnames")=List of 2
..$ : chr [1:5] "1" "2" "3" "4" ...
..$ : chr [1:2] "a" "b"
So when apply() extracts a column of df_AB, you get a named vector, which is not identical to an unnamed vector.
> apply(df_AB, 2, str)
Named int [1:5] 1 2 3 4 5
- attr(*, "names")= chr [1:5] "1" "2" "3" "4" ...
Named int [1:5] 6 5 4 3 2
- attr(*, "names")= chr [1:5] "1" "2" "3" "4" ...
NULL
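To make the named-versus-unnamed point concrete, here is a minimal sketch. The definitions of df, df_ab and df_AB are assumptions reconstructed from the str() output above, not the original code, and the column extraction only approximates what apply() hands to the function.

```r
# Assumed reconstruction of the data; the original definitions are not shown here.
df    <- data.frame(a = 1:5, b = 6:2)
df_ab <- df[, 1:2]     # i missing
df_AB <- df[1:5, 1:2]  # i supplied

# Pull the same column out of each coerced matrix, roughly as apply() does.
col_ab <- as.matrix(df_ab)[, "a"]
col_AB <- as.matrix(df_AB)[, "a"]

names(col_ab)                      # no names
names(col_AB)                      # row names carried over as element names
identical(col_ab, col_AB)          # FALSE: the names attribute differs
identical(col_ab, unname(col_AB))  # TRUE once the names are dropped
```

If you only care about the values, wrapping the vector in unname() inside the applied function is one way to sidestep the difference.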
Contrast that with the subset() function, which selects rows using a logical vector for the value of i. It looks like subsetting a data.frame with a non-missing value for i is what causes this difference in the row.names attribute:
> str(as.matrix(df[1:5, 1:2]))
int [1:5, 1:2] 1 2 3 4 5 6 5 4 3 2
- attr(*, "dimnames")=List of 2
..$ : chr [1:5] "1" "2" "3" "4" ...
..$ : chr [1:2] "a" "b"
> str(as.matrix(df[, 1:2]))
int [1:5, 1:2] 1 2 3 4 5 6 5 4 3 2
- attr(*, "dimnames")=List of 2
..$ : NULL
..$ : chr [1:2] "a" "b"
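The same effect should show up for any non-missing i, including the logical vector that subset() uses. A small sketch, assuming the same hypothetical df as above (the names via_subset, via_logical and via_missing are only for illustration):

```r
df <- data.frame(a = 1:5, b = 6:2)  # assumed data, not the original definition

via_subset  <- subset(df, select = c(a, b))  # rows chosen by a logical i internally
via_logical <- df[rep(TRUE, nrow(df)), 1:2]  # explicit logical i
via_missing <- df[, 1:2]                     # i missing

rownames(as.matrix(via_subset))   # "1" "2" "3" "4" "5"
rownames(as.matrix(via_logical))  # "1" "2" "3" "4" "5"
rownames(as.matrix(via_missing))  # NULL
```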
You can see all the gory details of the difference between the data.frames by calling .Internal(inspect(x)) on each of them, if you're interested.
As Roland pointed out in his comments, you can use the .row_names_info() function to see the differences in only the row names.
Notice that when i is missing, the result of .row_names_info() is negative, but it is positive if you subset with a non-missing i.
> .row_names_info(df_ab, type=1)
[1] -5
> .row_names_info(df_AB, type=1)
[1] 5
What these values mean is explained in ?.row_names_info:
type: integer. Currently ‘type = 0’ returns the internal
‘"row.names"’ attribute (possibly ‘NULL’), ‘type = 2’ the
number of rows implied by the attribute, and ‘type = 1’ the
latter with a negative sign for ‘automatic’ row names.
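As a sketch of what the other type values report, and of one possible workaround, the following again assumes the reconstructed df, df_ab and df_AB. Resetting the row names with rownames(x) <- NULL is my suggestion here, not something from the original question.

```r
df    <- data.frame(a = 1:5, b = 6:2)  # assumed data
df_ab <- df[, 1:2]
df_AB <- df[1:5, 1:2]

.row_names_info(df_ab, type = 0L)  # compact internal form: NA -5
.row_names_info(df_AB, type = 0L)  # stored explicitly: 1 2 3 4 5
.row_names_info(df_ab, type = 2L)  # 5 rows either way
.row_names_info(df_AB, type = 2L)

# Resetting the row names makes them 'automatic' again.
rownames(df_AB) <- NULL
.row_names_info(df_AB, type = 1L)                  # -5
identical(as.matrix(df_ab), as.matrix(df_AB))      # TRUE: no rownames on either
```

After the reset, apply() should see the same matrix for both data frames, so the columns it passes along are unnamed in both cases.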