I'm trying to add rows to a data frame within an sapply function call, but it's returning a matrix, whereas I want a data frame with variables in the columns and names/addresses in the rows.

This is a trivial example that demonstrates the problem. I'm doing it this way to avoid using a 'for' loop. Can someone please teach me how I should be doing this?

# initialize an empty data frame
absdf <- data.frame(
    name = character(),
    address = character(), 
    city = character(),
    stringsAsFactors=FALSE)

# a numeric vector. 
idvs <- c(123, 465)

print("initial empty data frame:")
print(absdf)

absdf <- sapply(idvs, function(idv) {

    name <- paste("John Doe", idv)
    address <- "123 Main St."
    city <- "Elmhurst"

    absdf <- rbind(absdf, data.frame(
        name = name, 
        address = address,
        city = city,
        stringsAsFactors=FALSE))

})

# print it out - a sanity check
print("absdf final:")
print(absdf)

Here's what it outputs:

[1] "initial empty data frame:"
[1] name    address city   
<0 rows> (or 0-length row.names)
[1] "absdf final:"
        [,1]           [,2]          
name    "John Doe 123" "John Doe 465"
address "123 Main St." "123 Main St."
city    "Elmhurst"     "Elmhurst"    

And finally, why is it a matrix?

> class(absdf)
[1] "matrix"
Frank
rstober

1 Answer

sapply tries to simplify its result to a matrix, which is why you are not getting the data frame you expect.

From the description of the "simplify" argument in ?sapply:

logical or character string; should the result be simplified to a vector, matrix or higher dimensional array if possible?
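To see the simplification at work, here is a minimal sketch. The helper name make_row is hypothetical (it is not in the question) and just wraps the data.frame() call from the original code:

```r
# Hypothetical helper (not from the original post): builds a one-row data frame.
make_row <- function(idv) {
    data.frame(
        name = paste("John Doe", idv),
        address = "123 Main St.",
        city = "Elmhurst",
        stringsAsFactors = FALSE)
}

idvs <- c(123, 465)

# Default simplify = TRUE: the two 3-column data frames are collapsed.
simplified <- sapply(idvs, make_row)

# simplify = FALSE: the result stays a plain list of data frames.
unsimplified <- sapply(idvs, make_row, simplify = FALSE)

is.matrix(simplified)                # TRUE
dim(simplified)                      # 3 2
is.data.frame(unsimplified[[1]])     # TRUE
```

Each data frame has length 3 (one element per column), so sapply sees two results of equal length and arranges them as a 3 x 2 matrix, with the column names ending up as row names.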

Since sapply is a wrapper for lapply to simplify the output, try creating the data frames with lapply directly.

The common idiom do.call(rbind, <list>) then row-binds the elements of that list into a single data frame.

absdf <- lapply(idvs, function(idv) {

    name <- paste("John Doe", idv)
    address <- "123 Main St."
    city <- "Elmhurst"

    data.frame(
        name = name, 
        address = address,
        city = city,
        stringsAsFactors=FALSE)

})
do.call(rbind, absdf)
#           name      address     city
# 1 John Doe 123 123 Main St. Elmhurst
# 2 John Doe 465 123 Main St. Elmhurst
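If you want to keep the combined result rather than only print it, something along these lines should work. This is a sketch of the same approach; the intermediate name rows is introduced here purely for illustration:

```r
idvs <- c(123, 465)

# One single-row data frame per id, kept in a list (the name `rows` is illustrative).
rows <- lapply(idvs, function(idv) {
    data.frame(
        name = paste("John Doe", idv),
        address = "123 Main St.",
        city = "Elmhurst",
        stringsAsFactors = FALSE)
})

# Row-bind the list elements and store the result.
absdf <- do.call(rbind, rows)

class(absdf)   # "data.frame"
nrow(absdf)    # 2
```

Keeping the list and the combined data frame under different names also makes it easier to inspect the intermediate list when something goes wrong.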
Pierre L
  • @RichardScriven I think it's better to avoid the wrapper rather than use it and then cancel its effect. – Pierre L Sep 29 '15 at 00:27
  • Better to just skip directly to data.frame, eh? `data.frame(name = paste("John Doe", idvs), address = "123 Main St.", city = "Elmhurst", stringsAsFactors=FALSE)` – Frank Sep 29 '15 at 00:29
  • @Frank I agree for this case, but the question is most likely over-simplified for demonstration of the effect described. – Pierre L Sep 29 '15 at 00:37
  • Thanks Pierre, that's the solution I was looking for! I will accept your answer when I figure out how to. And yes, the question is a simplified version of the real problem. One note: I had to modify it slightly to absdf <- do.call(rbind, absdf), which results in a data frame. – rstober Sep 29 '15 at 02:06