Assuming I'm understanding the documentation of [[ correctly, a matrix can be used to subset a data.frame:

A third form of indexing is via a numeric matrix with the one column for each dimension: each row of the index matrix then selects a single element of the array, and the result is a vector. Negative indices are not allowed in the index matrix. NA and zero values are allowed: rows of an index matrix containing a zero are ignored, whereas rows containing an NA produce an NA in the result.

While this works for [, I'm struggling to understand how to do this with [[.

mtcars[1:6, 1:6]
#>                    mpg cyl disp  hp drat    wt
#> Mazda RX4         21.0   6  160 110 3.90 2.620
#> Mazda RX4 Wag     21.0   6  160 110 3.90 2.875
#> Datsun 710        22.8   4  108  93 3.85 2.320
#> Hornet 4 Drive    21.4   6  258 110 3.08 3.215
#> Hornet Sportabout 18.7   8  360 175 3.15 3.440
#> Valiant           18.1   6  225 105 2.76 3.460
(ind <- matrix(1:6, ncol = 2))
#>      [,1] [,2]
#> [1,]    1    4
#> [2,]    2    5
#> [3,]    3    6
mtcars[ind]
#> [1] 110.00   3.90   2.32
mtcars[[ind]]
#> Error in as.matrix(x)[[i]]: attempt to select more than one element in vectorIndex

Is this a bug? Or am I misinterpreting the documentation?

Here is the source of [[.data.frame (R v3.6.1):

function (x, ..., exact = TRUE)
{
    na <- nargs() - !missing(exact)
    if (!all(names(sys.call()) %in% c("", "exact")))
        warning("named arguments other than 'exact' are discouraged")
    if (na < 3L)
        (function(x, i, exact) if (is.matrix(i))
            as.matrix(x)[[i]]
        else .subset2(x, i, exact = exact))(x, ..., exact = exact)
    else {
        col <- .subset2(x, ..2, exact = exact)
        i <- if (is.character(..1))
            pmatch(..1, row.names(x), duplicates.ok = TRUE)
        else ..1
        col[[i, exact = exact]]
    }
}
Stefan Bossbaly
nbenn
  • **The [[ form allows only a single element to be selected using integer or character indices,....** – NelsonGon Oct 25 '19 at 14:47
  • @NelsonGon Could you please explain how my question is a duplicate of the linked question? I'm well aware of the general differences between `[` and `[[`. I'm asking whether it's possible to subset a `data.frame` using `[[` and a matrix for indexing. Could you tell me where exactly this is answered in the linked question? I could not find anything relating to my exact question. – nbenn Oct 25 '19 at 14:56
  • Hello, sorry, I normally link to a possible duplicate, and if OP or someone else thinks it's a duplicate then it is closed. While you need to use a matrix for indexing, R's `[[` only supports single elements, hence you cannot use `[[`, which is what the linked question states. You might conceptually map your matrix indices to `[[` and use that. I'm sorry if you found it offensive to close your well-worded question. – NelsonGon Oct 25 '19 at 15:00
  • @NelsonGon I'm not offended. I simply don't see how you are helping my question. Your only argument is a sentence from R-lang, which I don't see as an answer. In some sense, what I propose is exactly what R-lang says on the topic: I'm selecting elements using numeric indices. – nbenn Oct 25 '19 at 15:13
  • Alright, I might be wrong but subsetting `1:6` and `1:6` is different from a matrix built with `1:6` and 2 columns. In any case, what is the actual use case like? I mean there are several alternative ways to subset. – NelsonGon Oct 25 '19 at 15:22
  • @NelsonGon could you please undo the closing of my question? Clearly you are wrong in claiming that there is no provision for subsetting a `data.frame` with a matrix using `[[`. – nbenn Oct 25 '19 at 15:32
  • I voted to reopen. I think it needs some votes, hopefully someone with more "power" can reopen it. Frankly I sometimes don't vote on duplicates to avoid issues such as this. I don't want to be right or wrong, I actually prefer being wrong since I'm here to learn. – NelsonGon Oct 25 '19 at 15:33
  • @NelsonGon Thank you. – nbenn Oct 25 '19 at 15:34

1 Answer

The doc page (?Extract) you reference says that arrays can be indexed by matrices. Implicitly, I take that to mean non-arrays cannot be indexed by matrices. Data frames are not arrays, so they cannot be indexed by matrices. (Matrices are arrays, of course.)


I do think you're misinterpreting the documentation. You're looking at a documentation page that jointly documents [, [[, and $. In the argument description, it says

When indexing arrays by [ a single argument i can be a matrix with as many columns as there are dimensions of x...

The section you quote at the top of your question comes later on, under the heading Matrices and Arrays, which I take to be a section about subsetting matrices and arrays, not about using matrices as indices. (Look at the rest of the section, and the sections before and after, and I think you'll agree with me.)
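As a small illustration of the matrix-index form for [ that the argument description refers to, here is a sketch on a 3-dimensional array. The array `arr` and the index matrix `idx` are made up for this example and do not come from the documentation page.

```r
# Hypothetical 2 x 3 x 4 array, filled in column-major order with 1..24
arr <- array(1:24, dim = c(2, 3, 4))

# One column per dimension; each row addresses a single element
idx <- rbind(c(1, 1, 1),
             c(2, 3, 4))

picked <- arr[idx]   # same elements as c(arr[1, 1, 1], arr[2, 3, 4])
picked

# A data frame is not an array, whereas its as.matrix() conversion is
is.array(mtcars)
is.array(as.matrix(mtcars))
```

Each row of `idx` plays the role of one full set of subscripts, which is why the result is a plain vector with one entry per row.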

Nowhere on that documentation page does it talk about using matrices as indices for [[.
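If element-by-element [[ extraction is wanted anyway, one option, along the lines of the suggestion in the comments to map the index matrix onto [[, is to loop over its rows. This is only a sketch: `ind` is the index matrix from the question, and `pick_one` is a helper name invented here.

```r
ind <- matrix(1:6, ncol = 2)

# Hypothetical helper: the two-index form of `[[` selects exactly one cell
pick_one <- function(i, j) mtcars[[i, j]]

# Apply the helper to each (row, column) pair of the index matrix
vals <- mapply(pick_one, ind[, 1], ind[, 2])
vals

# The matrix form of `[` from the question should give the same values
all.equal(vals, mtcars[ind])
```

In practice the `[` form from the question is simpler; the loop is mainly useful for seeing how a matrix index decomposes into single-element selections.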

I'm surprised it's handled specially in the [[ code you show, but as near as I can tell, a matrix given to [[.data.frame will error out unless it's a 1x1 matrix, in which case the data frame is treated as a matrix and the single element is returned, for some arcane reason (probably "compatibility with S", though I've no good guess as to why S would allow it).
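The 1x1 case can be checked with a short sketch. It assumes the `is.matrix(i)` branch of the v3.6.1 source shown in the question; other R versions may behave differently.

```r
# A 1 x 1 matrix index is dispatched to as.matrix(x)[[i]], i.e. the first cell
single <- mtcars[[matrix(1L)]]
single

# A larger index matrix fails; capture the message instead of stopping the script
msg <- tryCatch(
  mtcars[[matrix(1:6, ncol = 2)]],
  error = function(e) conditionMessage(e)
)
msg
```

Here `single` should be the mpg value of the first row, since `as.matrix(mtcars)[[1]]` is the first element of the converted matrix.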

Gregor Thomas