I am using apply to generate strings from a data frame.

For example:

df2 <- data.frame(a=c(1:3), b=c(9:11))
apply(df2, 1, function(row) paste0("hello", row['a']))
apply(df2, 1, function(row) paste0("hello", row['b']))

works as I would expect and generates

[1] "hello1" "hello2" "hello3"
[1] "hello9" "hello10" "hello11"

However, if I have

df <- data.frame(a=c(1:3), b=c(9:11), c=c("1", "2", "3"))
apply(df, 1, function(row) paste0("hello", row['a']))
apply(df, 1, function(row) paste0("hello", row['b']))

the output is

[1] "hello1" "hello2" "hello3"
[1] "hello 9" "hello10" "hello11"

Can anyone please explain why the second case pads "hello 9" with a space so that all the strings have the same length? I can work around the problem using gsub, but I would like to understand why this happens.

dillon
    take a look at what `as.matrix(df)` does (`apply` uses `as.matrix`) - http://stackoverflow.com/questions/15618527/why-does-as-matrix-add-extra-spaces-when-converting-numeric-to-character => fix with `apply(do.call(cbind, df), 1, function(row) { (str(row['b'])); paste0("hello", row['b']) })` – hrbrmstr Jul 08 '15 at 19:41

2 Answers

You don't need the apply function here, because paste0 is already vectorized over the column:

 paste0("hello", df[["a"]])
[1] "hello1" "hello2" "hello3"

paste0("hello", df[["b"]])
[1] "hello9"  "hello10" "hello11"
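
The same vectorized approach extends to combining several columns per row. A short sketch, assuming the df from the question (the "-" separator is only illustrative and not part of the original problem):

```r
df <- data.frame(a = 1:3, b = 9:11, c = c("1", "2", "3"))

# paste0 recycles "hello" and works element-wise across the columns,
# so each number is converted on its own and nothing gets padded
combined <- paste0("hello", df$a, "-", df$b)
print(combined)
```
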
user227710

This happens because apply converts your data.frame to a matrix before iterating over its rows. See what happens when you coerce df to a matrix:

 as.matrix(df)
     a   b    c  
[1,] "1" " 9" "1"
[2,] "2" "10" "2"
[3,] "3" "11" "3"

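Notice that the result is a character matrix, because column c is not numeric, and that the values of the numeric column b were padded to a common width, which is where the extra space in " 9" comes from.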
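
The padding itself comes from format(), which as.matrix applies to the numeric columns of a data frame that is not entirely numeric. Here is a minimal sketch of the effect and two possible ways around it, assuming the same df as in the question (trimws requires R 3.2.0 or later; the variable names are only illustrative):

```r
df <- data.frame(a = 1:3, b = 9:11, c = c("1", "2", "3"))

# format() pads numbers to a common width, which is where " 9" comes from
padded <- format(df$b)

# Option 1: strip the padding inside the function
trimmed <- apply(df, 1, function(row) paste0("hello", trimws(row["b"])))

# Option 2: pass only the numeric columns, so the matrix stays numeric
numeric_only <- apply(df[c("a", "b")], 1, function(row) paste0("hello", row["b"]))

print(padded)
print(trimmed)
print(numeric_only)
```

Option 2 avoids the character conversion altogether, so it may be preferable when the function only needs the numeric columns.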

Carlos Cinelli