
I am having a hard time understanding the difference between a double bracket subset and a single bracket subset.

I am fairly new to open source programming, and I find the R help pages (`?`) hard to follow because some of the information there is too technical for my current understanding of R. I have tried googling the difference, and although that gave me a rough idea, I still don't completely understand it, especially in the example below.

What I'm trying to understand is how the double bracket subset is used in this specific code:

tmp <- vector(mode = "list", length = 10)
listall <- list.files("specdata", full.names = TRUE)
tmp[[1]] <- read.csv(listall[[1]])

The file that listall[1] points to contains the following data:

  Date       sulfate nitrate ID
1 1/1/2003       1      10   1
2 1/2/2003       2      11   1
3 1/3/2003       3      12   1
4 1/4/2003       4      13   1
5 1/5/2003       5      14   1
6 1/6/2003       6      15   1
7 1/7/2003       7      16   1
8 1/8/2003       8      17   1
9 1/9/2003       9      18   1

Sure enough, the double bracket code above puts the data frame into the first slot of the list, tmp[[1]].

Why do I have to use the double bracket subset to put that data frame in? Won't a single bracket subset do?

 tmp[1] <- read.csv(listall[1])

Running this code will produce a warning message:

Warning message:
In tmp[1] <- read.csv(listall[1]) :
number of items to replace is not a multiple of replacement length

and printing tmp[1] afterwards shows a mixed-up result that contains only the Date column instead of the whole data frame:

[[1]]
[1] 1/1/2003 1/2/2003 1/3/2003 1/4/2003 1/5/2003 1/6/2003 1/7/2003 1/8/2003 1/9/2003
Levels: 1/1/2003 1/2/2003 1/3/2003 1/4/2003 1/5/2003 1/6/2003 1/7/2003 1/8/2003 1/9/2003

Can someone please explain why it shows a warning message like that and why the data frame is all mixed up?

From my limited understanding of subsetting, assigning read.csv(listall[1]) to tmp[1] should put the DATA FRAME into tmp[1], which is the first slot of the list.

Why does this require double bracket subsetting?

Kevin Yang
  • You have two different cases. (1) `listall` is an atomic vector, in which case the two forms are equivalent. (2) `tmp` is a `list` and `[` extracts *sub-lists* while `[[` extracts *elements* of the list. You want to assign the output of `read.csv` to an element of the list `tmp` so you must use the latter. (I could only understand the difference when trying both with data.frames, which are special cases of lists.) – Rui Barradas Oct 03 '17 at 09:32
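
The two cases described in the comment above can be tried out without the CSV files. This is only a minimal sketch: `df` is a hypothetical stand-in for the result of `read.csv(listall[1])`, and `files` and `tmp2` are made-up names that do not appear in the question. It also shows why the single bracket assignment warns: a data frame is itself a list of columns, so `tmp2[1] <- df` tries to put four items into one slot and keeps only the first column.

```r
# Hypothetical stand-in for the data frame returned by read.csv(listall[1])
df <- data.frame(
  Date    = c("1/1/2003", "1/2/2003", "1/3/2003"),
  sulfate = 1:3,
  nitrate = 10:12,
  ID      = 1L
)

# Case 1: an unnamed atomic character vector, like the output of list.files().
# Here [ and [[ return the same length-1 character vector.
files <- c("a.csv", "b.csv")
identical(files[1], files[[1]])  # TRUE

# Case 2: a list. [ returns a sub-list, [[ returns the element itself.
tmp <- vector(mode = "list", length = 2)
tmp[[1]] <- df
class(tmp[1])    # "list"
class(tmp[[1]])  # "data.frame"

# Single bracket assignment treats df as a list of 4 columns and tries to
# fit them into 1 slot: R warns and stores only the first column (Date).
tmp2 <- vector(mode = "list", length = 2)
tmp2[1] <- df    # warning: number of items to replace is not a multiple ...
is.data.frame(tmp2[[1]])  # FALSE

# Single brackets do work if the replacement is a list of length 1.
tmp2[1] <- list(df)
identical(tmp2[[1]], df)  # TRUE
```

So `tmp[[1]] <- df` and `tmp[1] <- list(df)` should end up with the same result; the double bracket form is just the more direct way to say "replace this one element".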
