I have a list of data frames read from files named "D1.txt", "D2.txt", ..., "D45.txt". Each file contains 2 columns and thousands of rows.

I am trying to add a new column to each data frame in the list with the following code, but it fails with the error "incorrect number of subscripts on matrix".

The code I am using is:

L <- lapply(seq_along(L), function(i) { 
    L[[i]][, paste0('DF', i)] <-  1
    L[[i]] 
})

where L is the name of the list containing the data frames.

Why does this error occur? Thanks! :)

Edit: A reproducible example:

# Create dummy data
L <- replicate(5, expand.grid(1:10, 1:10)[sample(100, 10), ], simplify=FALSE)

# Add a column to each data.frame in L. 
# This will indicate presence of the pair when we merge.
L <- lapply(seq_along(L), function(i) { 
  L[[i]][, paste0('DF', i)] <-  1
  L[[i]] 
})
user3778242

2 Answers

I think that when you read in your "D1.txt", "D2.txt", ..., "D45.txt" files they get converted to matrices, and that is why your lapply call fails. I'll use your example:

L <- replicate(5, expand.grid(1:10, 1:10)[sample(100, 10), ], simplify=FALSE)

If we check the first element of the list with class(L[[1]]), it outputs [1] "data.frame". If you run your lapply call on this list, which contains only data frames, you will see no error and it will give you what you want. If, however, we convert all elements in the list to matrices:

for(i in seq_along(L)){
     L[[i]] <- as.matrix(L[[i]])
}

and check with class(L[[1]]), it outputs [1] "matrix" (or [1] "matrix" "array" in R 4.0.0 and later). If you now run your lapply call on L, which contains matrices, you get:

> L <- lapply(seq_along(L), function(i) { 
+   L[[i]][, paste0('DF', i)] <-  1
+     L[[i]] 
+     })
Error in `[<-`(`*tmp*`, , paste0("DF", i), value = 1) : 
  subscript out of bounds

Hence, you can make sure that your files are coerced to data frames when you read them in, use @Richard's solution, or read in your files and then coerce them to data frames via

 for(i in seq_along(L)){
    L[[i]] <- as.data.frame(L[[i]])
}

and then use your lapply call.
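
As a minimal sketch of that diagnosis and fix, assuming the list elements really are matrices (which is only a guess about the original files), the steps could look like this. The names L_mat, L_df, L_new and is_df are illustrative and do not come from the question.

```r
# Build a list of matrices to mimic the suspected situation
# (assumption: the real files ended up as matrices, not data frames).
L_mat <- replicate(3, matrix(1:20, ncol = 2), simplify = FALSE)

# Check what each element actually is before assigning columns.
is_df <- vapply(L_mat, is.data.frame, logical(1))

# Coerce every element to a data frame in one pass.
L_df <- lapply(L_mat, as.data.frame)

# The lapply() call from the question now works on the coerced list.
L_new <- lapply(seq_along(L_df), function(i) {
  L_df[[i]][, paste0("DF", i)] <- 1
  L_df[[i]]
})

names(L_new[[2]])
```

Checking the classes first shows whether the coercion step is needed at all; if every element is already a data frame, the cause of the error lies elsewhere.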

lord.garbage
  • I think the matrix coercion comes from the line `L[[i]][, paste0('DF', i)] <- 1`. It should be `L[[i]][paste0('DF', i)] <- 1` since all `L` elements are data frames. Look at the result of `sapply(L, class)` – Rich Scriven Aug 18 '14 at 15:06
Here's a small example of how to add columns to data frames stored in a list. Use `[<-` in your lapply call to assign the new column. Here I add the column "newCol", which contains the values 10 and 11, to each data frame in lst:

> lst <- list(a = data.frame(x = 1:2), b = data.frame(y =3:4))
> lapply(lst, `[<-`, ,'newCol', 10:11)
# $a
#   x newCol
# 1 1     10
# 2 2     11
#
# $b
#   y newCol
# 1 3     10
# 2 4     11
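
The example above gives every data frame the same column name. If each one needs its own name (DF1, DF2, ... as in the question), one possible sketch uses Map to pair each data frame with its name. The helper add_flag and the result name L_flagged are hypothetical, introduced here only for illustration.

```r
# Dummy data in the same shape as the question's reproducible example.
L <- replicate(5, expand.grid(1:10, 1:10)[sample(100, 10), ], simplify = FALSE)

# Hypothetical helper: add a column of 1s under the given name.
add_flag <- function(df, col_name) {
  df[[col_name]] <- 1
  df
}

# Pair each data frame with its own column name: DF1, DF2, ...
L_flagged <- Map(add_flag, L, paste0("DF", seq_along(L)))

# Name of the newly added (third) column in each data frame.
sapply(L_flagged, function(df) names(df)[3])
```

Using df[[col_name]] <- 1 only works when the elements are data frames (or lists), so the class check from the other answer still applies.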
Rich Scriven