Concrete examples on why dimension is not defined for vectors (vectors are dimensionless) in R?

Question

Here, in a discussion of whether R needs to define a dimension for vectors:

M. JORGENSEN (Dept of Stat, U of Waikato, NZ):
"Would it not make sense to have dim(A)=length(A) for all vectors?"

B.D. RIPLEY (Dept of Applied Statistics, Oxford, UK):
"No. A one-dimensional array and a vector are not the same thing. There are subtle differences, such as what names() means (see ?names).

That a 1D array and a vector print in the same way does occasionally lead to confusion, but then you also cannot tell from your printout that A has type integer and not double.
......
My questions:
(1) I cannot figure out the subtle difference in what names() means, and
(2) I cannot produce a concrete example of being unable to "tell from the printout that A has type integer and not double".

Any help clarifying the Jorgensen-Ripley discussion (with concrete examples in R) would be appreciated.

Mikko Marttila · Accepted Answer · 2018-07-09T14:01:45.720

To address the first question, let's first create a vector and a 1-d array:

(vector <- 1:10)
#>  [1]  1  2  3  4  5  6  7  8  9 10

(arr_1d <- array(1:10, dim = 10))
#>  [1]  1  2  3  4  5  6  7  8  9 10

If we give the objects some names, we can see the difference that Ripley alludes to by looking at the attributes:

names(vector) <- letters[1:10]
names(arr_1d) <- letters[1:10]

attributes(vector)
#> $names
#>  [1] "a" "b" "c" "d" "e" "f" "g" "h" "i" "j"
attributes(arr_1d)
#> $dim
#> [1] 10
#> 
#> $dimnames
#> $dimnames[[1]]
#>  [1] "a" "b" "c" "d" "e" "f" "g" "h" "i" "j"

That is, the 1-d array doesn't actually have a names attribute, but rather a dimnames attribute (which is a list, not a vector), the first element of which names() actually accesses.

This is covered in the "Note" section in ?names:

For vectors, the names are one of the attributes with restrictions on the possible values. For pairlists, the names are the tags and converted to and from a character vector.

For a one-dimensional array the names attribute really is dimnames[[1]].

Here we also see the lack of a dim attribute for vectors. (A related SO answer covers the differences between arrays and vectors, too.)
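
The same distinction can be checked programmatically. The following is a minimal sketch using only base R; the object names v and a are illustrative and not part of the original post:

```r
# A plain vector and a 1-d array holding the same values and labels.
v <- 1:3
a <- array(1:3, dim = 3)
names(v) <- c("x", "y", "z")
names(a) <- c("x", "y", "z")

# The labels are stored in different attributes.
names(attributes(v))                    # "names"
names(attributes(a))                    # "dim" "dimnames"

# names() on the 1-d array reads dimnames[[1]].
identical(names(a), dimnames(a)[[1]])   # TRUE

# Only the array carries a dim attribute.
dim(v)                                  # NULL
dim(a)                                  # 3
is.array(v)                             # FALSE
is.array(a)                             # TRUE
```

So names() gives the same answer for both objects, but only because it is special-cased for one-dimensional arrays.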

The additional attributes and the way they are stored mean that a 1-d array takes up a little more memory than the equivalent vector:

# devtools::install_github("r-lib/lobstr")
lobstr::obj_size(vector)
#> 848 B
lobstr::obj_size(arr_1d)
#> 1,056 B
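
If lobstr is not installed, a rough comparison is possible with base R's object.size(). This is only a sketch: object.size() gives an estimate, and the exact numbers depend on the R version and platform, so none are shown here. The names v and a are illustrative.

```r
# Same values and labels, stored as a vector and as a 1-d array.
v <- 1:10
a <- array(1:10, dim = 10)
names(v) <- letters[1:10]
names(a) <- letters[1:10]

# Estimated sizes in bytes (values vary between R versions).
object.size(v)
object.size(a)

# The array carries dim and dimnames, so it should come out larger.
object.size(a) > object.size(v)
```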

However, that's about the only reason I can think of why one would want to have separate types for vectors and 1-d arrays. I would assume this was really the question that Jorgensen was asking, i.e. why have a separate vector type without the dim attribute at all; and I don't think Ripley really addresses that. I'd be very interested to hear other rationale for this.

As for point 2), when you create a vector with : from whole-number endpoints such as 1:10, the result is an integer vector:

vector <- 1:10
typeof(vector)
#> [1] "integer"
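
As a small sketch of how the way a vector is created decides its type (base R only):

```r
typeof(1:10)            # "integer": whole-number endpoints
typeof(1.5:3)           # "double": fractional starting point
typeof(c(1, 2, 3))      # "double": numeric literals are doubles
typeof(c(1L, 2L, 3L))   # "integer": the L suffix marks integers
```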

A double with the same values will print the same:

double <- as.numeric(vector)
typeof(double)
#> [1] "double"

double
#>  [1]  1  2  3  4  5  6  7  8  9 10

But integers and doubles are not the same thing:

identical(vector, double)
#> [1] FALSE

The differences between integers and doubles in R are subtle; the main one is that integers take up less space in memory:

lobstr::obj_size(vector)
#> 88 B
lobstr::obj_size(double)
#> 168 B
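
To make Ripley's remark about the printout concrete, here is a minimal base R sketch (int_vec and dbl_vec are illustrative names). It shows that the printed output is identical, which functions do reveal the type, and one behavioural difference beyond memory:

```r
int_vec <- 1:3
dbl_vec <- c(1, 2, 3)

# The printed output is identical...
identical(capture.output(print(int_vec)),
          capture.output(print(dbl_vec)))     # TRUE

# ...but typeof() and str() reveal the storage type.
typeof(int_vec)    # "integer"
typeof(dbl_vec)    # "double"
str(int_vec)       #  int [1:3] 1 2 3
str(dbl_vec)       #  num [1:3] 1 2 3

# Integers overflow to NA (with a warning); doubles do not.
suppressWarnings(.Machine$integer.max + 1L)   # NA
.Machine$integer.max + 1                      # 2147483648
```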

See this answer for a more comprehensive overview of the differences between integers and doubles.

Created on 2018-07-09 by the reprex package (v0.2.0.9000).
