3

My question stems from the use of [[ and ]] in user-created functions to reference list elements. From what I can tell, [[ and ]] work the same way as [ and ] when applied to vectors.

Is this true of all other list operations though? As another example, I can use lapply on a vector.

It makes sense that this is true if a list is just a generalised vector, whose entries can be of differing modes.
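To make the two observations above concrete, here is a minimal sketch. The objects `v` and `l` are made-up example data, not taken from any real code.

```r
# Hypothetical example data: an atomic vector and a list with the same contents
v <- c(a = 1, b = 2, c = 3)
l <- list(a = 1, b = 2, c = 3)

# `[[` extracts a single element from either structure
v[[2]]   # 2
l[[2]]   # 2

# lapply() accepts both and always returns a list
from_vector <- lapply(v, function(x) x * 10)
from_list   <- lapply(l, function(x) x * 10)

identical(from_vector, from_list)   # TRUE
```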

Alex
  • The answer you referenced is great at explaining the difference between `[]` and `[[]]`, but it does not tell me anything about the conceptual relationship between lists and vectors. Also, the title of the question you referenced makes no mention at all of vectors. – Alex May 02 '14 at 04:54
  • Perhaps a better phrasing of my question is "Can any function written for a list accept a vector instead?" – Alex May 02 '14 at 04:58
  • Ah, I see what you're asking now, that's actually a good question. – Scott Ritchie May 02 '14 at 05:10
  • Strictly speaking, the answer is "no", the obvious example being functions that assign complex data types to elements of the list. – Scott Ritchie May 02 '14 at 05:14
  • That is indeed a good counterexample. What about functions that do not make assignments to the list? To give some more context, I am writing a function using `[[]]` instead of `[]` as I do not want it to break when a list is passed to the function. – Alex May 02 '14 at 05:16

2 Answers

2

EDIT: The one-and-a-half line answer is that both lists and atomic vectors are types of vectors, and subset exactly the same way.

This answer expands on the difference between lists and atomic vectors.

The best explanation of R's data structures, and specifically of the difference between lists and atomic vectors, is (in my opinion) Hadley Wickham's new book: http://adv-r.had.co.nz/Data-structures.html

Both lists and atomic vectors are one-dimensional data structures. However, atomic vectors are homogeneous and lists are heterogeneous. Lists can contain any type of vector, including other lists. Atomic vectors, on the other hand, are flat.
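As a rough illustration of homogeneous versus heterogeneous, the sketch below uses made-up values. The variable names are hypothetical.

```r
# c() coerces mixed inputs to one common type
v <- c(1, "a", TRUE)
typeof(v)            # "character"

# list() keeps each element's own type and can nest other lists
l <- list(1, "a", TRUE, list(2, 3))
sapply(l, typeof)    # "double" "character" "logical" "list"

# Assignment shows the same contrast
nums <- c(1, 2, 3)
nums[[2]] <- "a"     # the whole vector is coerced to character
typeof(nums)         # "character"

items <- list(1, 2, 3)
items[[2]] <- "a"    # only the second element changes
typeof(items[[1]])   # "double"
```

This is also why a function that assigns into its argument can behave differently for a list and an atomic vector, even when reading elements behaves the same.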

As far as subsetting with [] versus [[]] goes, [] is preserving for both lists and atomic vectors, whereas [[]] is simplifying. Thus, [] and [[]] are NOT the same, whether applied to lists OR atomic vectors. For example, [[]] simplifies a named vector by dropping the name, while subsetting a named vector with [] keeps the name. For a list, [[]] pulls out the contents of a single element, which can be any of a number of simpler data structures. Subsetting a list with [] always returns a list (preserving).
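A small sketch of preserving versus simplifying, using hypothetical named objects `v` and `l`:

```r
v <- c(a = 1, b = 2)
l <- list(a = 1, b = "two")

# Atomic vector: `[` keeps the name, `[[` drops it
names(v["a"])      # "a"
names(v[["a"]])    # NULL

# List: `[` returns a length-one list, `[[` returns the element itself
is.list(l["a"])    # TRUE
is.list(l[["a"]])  # FALSE
l[["a"]]           # 1
```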

Subsetting an atomic vector with [[]] returns a length-one atomic vector. Subsetting a list with [[]] can return any of a number of different classes of data structure. This goes back to the fact that atomic vectors are homogeneous and lists are heterogeneous. However, according to Hadley, subsetting a list with [] works exactly the same way as subsetting an atomic vector.
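For instance, with some invented contents (the list below is purely illustrative), `[[` gives back whatever happens to be stored in the element:

```r
v <- c(1.5, 2.5, 3.5)
length(v[[2]])    # 1
class(v[[2]])     # "numeric"

# A list can hold elements of very different classes
l <- list(1:3, "text", mean, data.frame(x = 1:2))
classes <- sapply(seq_along(l), function(i) class(l[[i]])[1])
classes           # "integer" "character" "function" "data.frame"

# `[` on the same list always gives back a list
class(l[3])       # "list"
```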

Take a look at this section of Hadley's book for further reference: http://adv-r.had.co.nz/Subsetting.html#subsetting-operators

Ben Rollert
  • I will disagree about the stripping of names: `[[` and `[` do the same for both lists and vectors. `[[` on a list will also strip the name of that element (if the list is named). – Scott Ritchie May 02 '14 at 05:24
  • You're correct - my point is that `[]` and `[[]]` are different. The OP stated that "[[ and ]] work the same way as [ and ] when applied to vectors." – Ben Rollert May 02 '14 at 05:27
  • In any case, Hadley really has a thorough discussion on this. I was almost tempted to just post a link to his book and give him the credit. – Ben Rollert May 02 '14 at 05:30
  • Right, Hadley does a great job of explaining the differences. I think what the OP is trying to get at is whether the heterogeneous vs. homogeneous difference makes any practical difference beyond the data types that can be stored by each. – Scott Ritchie May 02 '14 at 05:58
  • The one line answer is that both atomic vectors and lists are vectors, and subset the same way. – Ben Rollert May 02 '14 at 06:04
  • One of the important points you didn't mention in the body of this answer is that `[[`, unlike `[`, can never select more than one element (e.g. `x[[1:3]]` does not return three elements). So whilst stating that subsetting an atomic vector by `[[` returns an atomic vector is technically correct, you can only ever return a length one atomic vector in this way. – Simon O'Hanlon May 02 '14 at 07:24
  • `[[` can (is meant to) select just one element, but there can be more than one index in the case of nested lists: `list(list(1, list(1,2,3)))[[1:3]]` – lebatsnok May 02 '14 at 08:12
  • There are other differences as well, such as when an index is out of bounds. – Ben Rollert May 02 '14 at 08:36
  • @lebatsnok that's interesting. I haven't seen `[[` used like that before. I would have ordinarily used `x[[1]][[2]][[3]]`. Nice shorthand, though should be careful when returning to code written like this. I feel it could be easy to get confused! – Simon O'Hanlon May 02 '14 at 09:26
1

Since I wasn't able to come up with any more counter examples, I referred to the documentation on R's internals, and it appears your intuition is correct.

If you look at the section on the underlying structure of R's data structures in C, SEXPTYPEs, lists are implied to be generic vectors:

19 VECSXP list (generic vector)
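
The same relationship is visible from the R prompt. The following is a small sketch with made-up objects, not something taken from the internals manual:

```r
v <- c(1, 2, 3)
l <- list(1, "a", TRUE)

# Both are vectors; only one of them is atomic
is.vector(v)     # TRUE
is.vector(l)     # TRUE
is.atomic(v)     # TRUE
is.atomic(l)     # FALSE

typeof(l)        # "list"

# vector() can create a list directly, like any other vector mode
empty <- vector("list", 2)
is.list(empty)   # TRUE
length(empty)    # 2
```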

Scott Ritchie