Imagine I have the following stacked data matrix:

mY <- data.frame(matrix(c(c(1:10),c("A 1","A 1","A 1","A 1","A 1","B 1","B 1","B 1","B 1","B 1")),10))

Resulting in:

   X1 X2
1   1 A 1
2   2 A 1
3   3 A 1
4   4 A 1
5   5 A 1
6   6 B 1
7   7 B 1
8   8 B 1
9   9 B 1
10 10 B 1

This is just an example of a data frame which I want to unstack, where entries in X2 contain a space character. It could also have been 'hot dog', or 'boiled egg'.

When I use

mB <- unstack(mY, X1~X2)

I get

  A.1 B.1
1   1   6
2   2   7
3   3   8
4   4   9
5   5  10

Notice that the column names have changed to A.1 and B.1, although the values in X2 were 'A 1' and 'B 1'. When I use mB["A 1"] it returns null, whereas mB["A.1"] returns column A.1. How can I overcome this?

Thanks in advance.

Jaap Paap
  • See also [this question](http://stackoverflow.com/questions/3411201/specifying-column-names-in-a-data-frame-changes-spaces-to) – ROLO Sep 15 '13 at 11:56
  • `unstack` doesn't let you pass `check.names = FALSE` to it (see `utils:::unstack.data.frame` for the code), so you're somewhat stuck with R making what it considers syntactically valid names and having to manually rename them later (if you want to break the rules). – A5C1D2H2I1M1N2O1R2T1 Sep 15 '13 at 12:08
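
The renaming described in the comment above can be reproduced in isolation. The following is a minimal sketch, not the internals of `unstack`: it assumes the conversion comes from the default `check.names = TRUE` behaviour of `data.frame()`, and the variable names (`spaced`, `checked`, `unchecked`, `wide`) are made up for illustration.

```r
# Hypothetical names, used only for this illustration.
spaced <- c("A 1", "B 1")

# make.names() replaces characters that are invalid in R names (such as spaces) with dots.
make.names(spaced)
# [1] "A.1" "B.1"

# data.frame() applies the same conversion by default ...
checked <- data.frame(`A 1` = 1:5, `B 1` = 6:10)
names(checked)
# [1] "A.1" "B.1"

# ... unless check.names = FALSE is passed.
unchecked <- data.frame(`A 1` = 1:5, `B 1` = 6:10, check.names = FALSE)
names(unchecked)
# [1] "A 1" "B 1"

# Because unstack() offers no such argument, one base R alternative is to split
# the values by group and build the data frame directly (groups must be equal-sized).
wide <- data.frame(split(1:10, rep(spaced, each = 5)), check.names = FALSE)
names(wide)
# [1] "A 1" "B 1"
```

Columns with spaces then have to be accessed with quotes or backticks, for example `wide[["A 1"]]` or ``wide$`A 1` ``.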

2 Answers

Using column names with spaces is mostly a bad idea, but if you want to go ahead and use them anyway, here's a simple workaround. It uses setNames() to rename the columns to the names stored in unique(mY$X2).

setNames(unstack(mY, X1~X2), unique(mY$X2))
#   A 1 B 1
# 1   1   6
# 2   2   7
# 3   3   8
# 4   4   9
# 5   5  10
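
One caveat worth checking before relying on this: `unique()` returns the groups in order of appearance, while `unstack()` appears to order its columns by the sorted group levels. The two agree in the example above, but not in general. The sketch below uses a hypothetical data frame `mY2`, whose groups are deliberately out of alphabetical order, and takes the names from `levels(factor(...))` instead.

```r
# Hypothetical data: the "B 1" rows come before the "A 1" rows.
mY2 <- data.frame(X1 = 1:4, X2 = c("B 1", "B 1", "A 1", "A 1"))

wide <- unstack(mY2, X1 ~ X2)

# Take the names in sorted level order, which matches the column order of unstack().
names(wide) <- levels(factor(mY2$X2))
names(wide)
# [1] "A 1" "B 1"

# Columns with spaces need [[ ]] or backticks.
wide[["A 1"]]
# [1] 3 4
wide$`B 1`
# [1] 1 2
```

With `unique(mY2$X2)` the same data would have labelled the values 3 and 4 as "B 1".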
Josh O'Brien

Out of curiosity I checked this: Unstacking with dcast keeps the "syntactically invalid names".

library(reshape2)

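# create an id variable to serve as the 'row variable' (LHS of the casting formula);
# passing seq_along(mY$X2) as the first argument keeps the ids integer, whatever the type of X2
mY$id <- ave(seq_along(mY$X2), mY$X2, FUN = seq_along)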

dcast(data = mY, id ~ X2, value.var = "X1")

#   id A 1 B 1
# 1  1   1   6
# 2  2   2   7
# 3  3   3   8
# 4  4   4   9
# 5  5   5  10
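
To see what the id variable contains, here is a small sketch of the `ave()` plus `seq_along()` idiom on its own. The vector `groups` is hypothetical and interleaved on purpose, to show that the counter restarts per group rather than per run of rows.

```r
# Hypothetical grouping vector with interleaved groups.
groups <- c("A 1", "A 1", "B 1", "A 1", "B 1")

# ave() applies seq_along() within each group and puts the results back in row order,
# giving a within-group row counter.
id <- ave(seq_along(groups), groups, FUN = seq_along)
id
# [1] 1 2 1 3 2
```

Each (id, group) pair is then unique, which is what lets the casting formula place one value per cell.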

@Josh O'Brien's solution is much cleaner though.

Henrik