How do I stack only some columns in a data frame?

Question

I have some data in a data frame in the following form:

A  B  C  V1  V2  V3
1  1  1  x   y   z
1  1  2  a   b   c
...

Where A,B,C are factors, and the combination A,B,C is unique for each row.

I need to convert some of the columns into factors, to achieve a form like:

A  B  C  V  val
1  1  1  V1 x
1  1  1  V2 y
1  1  1  V3 z
1  1  2  V1 a
1  1  2  V2 b
1  1  2  V2 c
...

This seems to relate to both stack and the inverse of xtabs, but I don't see how to specify that only certain columns should be "stacked".

See `reshape` and the reshape package. – Tyler Rinker Feb 01 '13 at 02:25 — Tyler Rinker, Feb 01 '13 at 02:25

thelatemail · Answer 1 · 2013-02-01T03:12:55.610

5

And before @AnandaMahto gets here and offers his base reshape solution, here's my attempt:

dat <- read.table(text = 'A  B  C  V1  V2  V3
1  1  1  x   y   z
1  1  2  a   b   c',header= T)

expandvars <- c("V1","V2","V3")

datreshape <- reshape(dat,
                   idvar=c("A","B","C"),
                   varying=list(expandvars),
                   v.names=c("val"),
                   times=expandvars,
                   direction="long")

> datreshape
         A B C time val
1.1.1.V1 1 1 1   V1   x
1.1.2.V1 1 1 2   V1   a
1.1.1.V2 1 1 1   V2   y
1.1.2.V2 1 1 2   V2   b
1.1.1.V3 1 1 1   V3   z
1.1.2.V3 1 1 2   V3   c

edited Feb 01 '13 at 03:12

answered Feb 01 '13 at 02:55

thelatemail

91,185
12
128
188

@agstudy, it's nice to have some other shapeshifters in my company. +1 to both of you. – A5C1D2H2I1M1N2O1R2T1 Feb 01 '13 at 03:59
@agstudy and Thela (sorry, you're stuck with that name now), I can't resist [adding an answer](http://stackoverflow.com/a/14639977/1270695) to a reshape question. (I should seek help....) – A5C1D2H2I1M1N2O1R2T1 Feb 01 '13 at 04:57
+1 to all, thanks! I didn't realize there was a community of R ninjas on SO. – saffsd Feb 01 '13 at 10:15

score 4 · Accepted Answer · answered Feb 01 '13 at 02:28

Using reshape2 package

dat <- read.table(text = 'A  B  C  V1  V2  V3
1  1  1  x   y   z
1  1  2  a   b   c',header= T)
library(reshape2)
melt(dat,id.vars = c('A','B','C'))
 A B C variable value
1 1 1 1       V1     x
2 1 1 2       V1     a
3 1 1 1       V2     y
4 1 1 2       V2     b
5 1 1 1       V3     z
6 1 1 2       V3     c

score 3 · Answer 3 · edited May 23 '17 at 12:16

`stack`

You are right that stack is a possibility, but you perhaps missed a key line in the documentation for stack:

Note that stack applies to vectors (as determined by is.vector): non-vector columns (e.g., factors) will be ignored (with a warning as from R 2.15.0).

So, how do we proceed?

Here's your data:

dat <- read.table(text = 'A  B  C  V1  V2  V3
    1  1  1  x   y   z
    1  1  2  a   b   c',header= T)

Here, we convert the factors to as.character:

dat[sapply(dat, is.factor)] = lapply(dat[sapply(dat, is.factor)], as.character)

Here's how we specify which columns to stack:

stack(dat[4:6])
#   values ind
# 1      x  V1
# 2      a  V1
# 3      y  V2
# 4      b  V2
# 5      z  V3
# 6      c  V3

But, we still need to "expand" your rows for columns 1-3. See here for how to do that.

With this information, we can use cbind to get the desired result.

cbind(dat[rep(row.names(dat), 3), 1:3], stack(dat[4:6]))
#     A B C values ind
# 1   1 1 1      x  V1
# 2   1 1 2      a  V1
# 1.1 1 1 1      y  V2
# 2.1 1 1 2      b  V2
# 1.2 1 1 1      z  V3
# 2.2 1 1 2      c  V3

`xtabs`

You are also right that xtabs seems like it could be a likely possibility, but xtabs actually expects the opposite of what you've provided. That is to say, when you specify a formula, it expects the items on the left hand side to be numbers, and the items on the right hand side to be factors. Thus, is your data were swapped, you could certainly use xtabs.

Here's a demonstration (which only works because you are using a simple example where we can easily match "letters" to "numbers").

dat2 <- dat # Make a copy of "dat"
# Swap out dat 4-6 with numbers
dat2[4:6] <- lapply(dat2[4:6], function(x) match(x, letters))
# Swap out dat 1-3 with letters
dat2[1:3] <- lapply(dat2[1:3], function(x) letters[x])
# Our new "dat"
dat2
#   A B C V1 V2 V3
# 1 a a a 24 25 26
# 2 a a b  1  2  3
data.frame(xtabs(cbind(V1, V2, V3) ~ A + B + C, dat2))
#   A B C Var4 Freq
# 1 a a a   V1   24
# 2 a a b   V1    1
# 3 a a a   V2   25
# 4 a a b   V2    2
# 5 a a a   V3   26
# 6 a a b   V3    3

In other words, your choice of tools could potentially be right, but your data needs to also be in the form that the tools expect.

But, I'm not sure why you'd want to do all the work I've shown when better solutions exist with reshape and friends ;)

Very late update...

You can also look at merged.stack from my "splitstackshape" package:

library(splitstackshape)
merged.stack(dat, var.stubs = "V", sep = "NoSep")
#    A B C .time_1 V
# 1: 1 1 1      V1 x
# 2: 1 1 1      V2 y
# 3: 1 1 1      V3 z
# 4: 1 1 2      V1 a
# 5: 1 1 2      V2 b
# 6: 1 1 2      V3 c

Or gather from "tidyr":

library(dplyr)
library(tidyr)
# gather(dat, var, val, V1:V3)
dat %>% gather(var, val, V1:V3) 
#   A B C var val
# 1 1 1 1  V1   x
# 2 1 1 2  V1   a
# 3 1 1 1  V2   y
# 4 1 1 2  V2   b
# 5 1 1 1  V3   z
# 6 1 1 2  V3   c

Thank you for this reply. I'm only just starting to understand the R data model, this is the sort of reply that provides very useful insight. — saffsd, Feb 01 '13 at 10:17
@saffsd, no problem. SO is a good place to learn about these things. Do be sure to read [this post](http://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example) though, and try to share your question in a reproducible format. — A5C1D2H2I1M1N2O1R2T1, Feb 01 '13 at 10:24
Following up from an old post, the base version can be made simpler like `cbind(dat[1:3],stack(lapply(dat[4:6],as.character)))` - no need to explicitly repeat the rows. — thelatemail, Jul 01 '16 at 05:15

How do I stack only some columns in a data frame?

3 Answers3

`stack`

`xtabs`

Very late update...

Linked

Related