In the programming language R what, precisely, is the meaning of

'['

which serves as a parameter to sapply() and lapply() in the following portion of code:

dd <- data.frame(
    A = c(1L, 2L, 3L), 
    B = c(4L, 5L, 6L), 
    C = c("X1=7;X2=8;X3=9",
          "X1=13;X2=14",
          "X1=5;X2=1;X3=8")
)
namev <- function(x) {
    a <- strsplit(x,"=")
    setNames(sapply(a,'[',2), sapply(a,'[',1))
}

vv <- lapply(strsplit(as.character(dd$C),";"), namev)

nm <- unique(unlist(sapply(vv, names)))

#extract data from all rows for every column
nv <- do.call(rbind, lapply(vv, '[', nm))

> dd$C
[1] X1=7;X2=8;X3=9 X1;; X1=13;X2=14
Levels: X1;; X1=13;X2=14 X1=7;X2=8;X3=9

@Henrik The answers to the two questions are the same, but the questions are different. The question of which this has been marked a duplicate (Using '[' square bracket as a function for lapply in R) presupposes the knowledge that [ is a function, which is not self-evident to us R newbies.

BigAl_LBL
  • Can you provide a simple example of what dd$C would look like? – Melissa Key Apr 09 '18 at 16:05
  • Just as an addition to the answers, one can access the documentation for the function `[` by typing `?'['`, though it takes some time to fully understand all the nuances discussed there. – R.S. Apr 09 '18 at 16:18
  • Yes I tried the help. It failed to clarify what was going on in the above example but has been most interesting especially in the light of the explanations in the answers below. – BigAl_LBL Apr 09 '18 at 18:17

4 Answers

[ is a function. In the examples below it is used with two arguments.

L <- list(a = 1:4, b = 1:3)

sapply(L, `[`, 2)
## a b 
## 2 2 

The above sapply is the same as either of these:

sapply(L, function(x) `[`(x, 2))

sapply(L, function(x) x[2])
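
Applied to the code in the question, the same rule explains lapply(vv, '[', nm): every argument after the function name is passed on to `[`, so each list element is indexed by the character vector nm. Here is a minimal sketch of that step; the values of vv and nm are small made-up stand-ins, not the asker's actual data.

```r
# Stand-in for the question's `vv`: a list of named character vectors.
vv <- list(c(X1 = "7", X2 = "8", X3 = "9"),
           c(X1 = "13", X2 = "14"))
nm <- c("X1", "X2", "X3")

# lapply(vv, `[`, nm) calls `[`(v, nm), i.e. v[nm], for each element v.
by_bracket <- lapply(vv, `[`, nm)
by_closure <- lapply(vv, function(v) v[nm])
identical(by_bracket, by_closure)

# Indexing by a name that is absent yields NA, so the shorter
# second vector is padded and rbind() gets rows of equal length.
nv <- do.call(rbind, by_bracket)
nv
```

Indexing by name rather than by position is what lets rows with different sets of keys line up in the same columns.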

It is a primitive function, so printing it shows no R-level source; it punts straight to the underlying C code.

`[`
## .Primitive("[")

S3 methods can be written for it. For example, these methods are available in vanilla R (the exact list depends on the R version and on which packages are loaded).

> methods("[")
 [1] [,nonStructure-method [.acf*                [.AsIs               
 [4] [.bibentry*           [.data.frame          [.Date               
 [7] [.difftime            [.Dlist               [.factor             
[10] [.formula*            [.getAnywhere*        [.hexmode            
[13] [.listof              [.noquote             [.numeric_version    
[16] [.octmode             [.pdf_doc*            [.person*            
[19] [.POSIXct             [.POSIXlt             [.raster*            
[22] [.roman*              [.SavedPlots*         [.simple.list        
[25] [.table               [.terms*              [.ts*                
[28] [.tskernel*           [.warnings           
see '?methods' for accessing help and source code

For example, try the following to see the R source code for these methods:

`[.data.frame`

`[.Date`
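
To illustrate what writing such a method looks like, here is a minimal sketch. The class "temps" and its unit attribute are hypothetical names invented for this example; they do not exist in base R.

```r
# Hypothetical class "temps": a numeric vector carrying a `unit` attribute.
temps <- structure(c(20.5, 22.1, 19.8), unit = "C", class = "temps")

# S3 method for `[`: subset the underlying data, then restore the
# class and attribute that plain subsetting would drop.
`[.temps` <- function(x, i) {
    structure(unclass(x)[i], unit = attr(x, "unit"), class = "temps")
}

sub <- temps[2:3]          # dispatches to `[.temps`
class(sub)
attr(sub, "unit")

# The function-call form dispatches in the same way.
identical(`[`(temps, 2:3), sub)
```

Without the method, temps[2:3] would come back as a bare numeric vector, which is why classes such as Date and factor define their own.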
G. Grothendieck

[ is a function.

iris[1,2] is the equivalent of '['(iris,1,2).

It needs to be quoted to be used this way as it's not a syntactically valid name (see ?make.names).

You could quote any function, though:

'head'(iris)

  Sepal.Length Sepal.Width Petal.Length Petal.Width Species
1          5.1         3.5          1.4         0.2  setosa
2          4.9         3.0          1.4         0.2  setosa
3          4.7         3.2          1.3         0.2  setosa
4          4.6         3.1          1.5         0.2  setosa
5          5.0         3.6          1.4         0.2  setosa
6          5.4         3.9          1.7         0.4  setosa
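
This is also why the quoted string can be handed to sapply() and lapply(): both resolve their FUN argument with match.fun(), which accepts either a function or a character string naming one. A small sketch of that lookup (the list L is made-up data):

```r
# match.fun() turns the string "[" into the function it names.
f <- match.fun("[")
identical(f, `[`)

L <- list(a = 1:4, b = 1:3)

# The string form and the backtick form give the same result.
identical(sapply(L, "[", 2), sapply(L, `[`, 2))
```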

FYI, the magrittr package includes the functions extract and extract2, which are aliases of [ and [[ but might be more readable for some (and can be used without quotes).

[<- and [[<- are functions too, used when assigning to an element of a vector, matrix, data frame or list; their magrittr aliases are inset and inset2.
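
A short sketch of these in function-call form, using base R only so that it runs without magrittr installed. The people list and the vectors are made-up example data.

```r
# `[[` pulls one element out of each inner list.
people <- list(list(name = "Ann", age = 31),
               list(name = "Bob", age = 45))
person_names <- sapply(people, `[[`, "name")
person_names

# An assignment such as v[2] <- 99 is evaluated as v <- `[<-`(v, 2, 99).
v1 <- c(10, 20, 30)
v1[2] <- 99

v2 <- c(10, 20, 30)
v2 <- `[<-`(v2, 2, 99)

identical(v1, v2)
```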

moodymudskipper

It refers to the function behind the indexing operation you would normally write with a pair of square brackets: x[3], for example, is really just the function call "["(x, 3).

Jordi

Pretty sure this is a duplicate. Either way, it acts as a subsetting function. For example:

a <- list(c(2,5),c(24,4),c(15,3))
lapply(a,'[',2)

This returns a list containing 5, 4 and 3, i.e. the second element of each vector.

iliupersis