tidyr
tidyr's `gather` is one of the easiest, most commonly used options. You'll first need to turn your row names into a new variable, `id`. I like tibble's `rownames_to_column` because I tend to prefer very descriptive function names, but you can use whatever method you like:
library(tidyr)
library(tibble)
df %>%
  rownames_to_column("id") %>%
  gather(conditions, values, -id)
#### OUTPUT ####
  id conditions values
1  1         C1      0
2  2         C1      1
3  3         C1      1
4  1         C2      1
5  2         C2      1
6  3         C2      0
7  1         C3      0
8  2         C3      0
9  3         C3      0
The first argument after the data (`conditions`) tells R where to store the variable names, and the second (`values`) tells R where to store the values of each former variable. The `-id` simply tells R to gather everything but `id`.
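To make the role of each argument more explicit, here is a minimal sketch that spells out `gather`'s `key` and `value` parameters by name. The `df` below is my reconstruction of the example data from the output shown above, so treat it as an assumption rather than the asker's exact object:

```r
library(tidyr)
library(tibble)

# Assumed example data, reconstructed from the printed output
df <- data.frame(C1 = c(0, 1, 1), C2 = c(1, 1, 0), C3 = c(0, 0, 0))

# Turn the row names into an explicit `id` column first
wide <- rownames_to_column(df, "id")

# key   = name of the new column holding the former column names
# value = name of the new column holding the cell values
# -id   = gather every column except `id`
long <- gather(wide, key = "conditions", value = "values", -id)
long
```

This should be equivalent to the piped version; naming the arguments just makes it easier to see which string ends up where.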
base R
Following your request, and building on Onyambu's excellent suggestion, here's how you might go about using base R's `reshape`. You can find a good, detailed explanation of how to use `reshape` here.
`reshape` can be a little unintuitive and cumbersome to use, and this was the least painful method I could come up with. It requires that you prepend the name you want your column of values to have in the long format dataframe, in this case `value`, to each column name. You should put a `.` in there too, i.e. `value.C1`. You can also do it without this step, but if you read the article I linked above you'll see that using this particular naming convention can save you some heartache later, when you deal with more complex cases:
names(df) <- paste0("value.", names(df))
reshape(df,                      # data
        direction = "long",      # long or wide
        varying = 1:3,           # the columns that should be stacked
        timevar = "condition"    # name of "time" variable, basically groups
)
#### OUTPUT ####
     condition value id
1.C1        C1     0  1
2.C1        C1     1  2
3.C1        C1     1  3
1.C2        C2     1  1
2.C2        C2     1  2
3.C2        C2     0  3
1.C3        C3     0  1
2.C3        C3     0  2
3.C3        C3     0  3
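For the "without this step" route mentioned above, a minimal sketch might look like the following. Instead of renaming the columns, it passes `v.names` and `times` explicitly so `reshape` doesn't have to guess them from the column names. The `df` here is reconstructed from the printed output and is an assumption on my part:

```r
# Assumed example data, reconstructed from the printed output
df <- data.frame(C1 = c(0, 1, 1), C2 = c(1, 1, 0), C3 = c(0, 0, 0))

long <- reshape(df,
                direction = "long",
                varying = list(names(df)),  # one set of columns to stack
                v.names = "value",          # name of the stacked value column
                timevar = "condition",      # name of the grouping column
                times = names(df),          # labels to use for each group
                idvar = "id"                # created from row numbers if absent
)
long
```

It's more typing, which is why I prefer the naming convention, but it leaves the original column names untouched.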
Apparently `reshape` creates an `id` variable automatically based on rows. It will also recognize `id` if you have it in your dataframe already:
names(df) <- paste0("value.", names(df))
df$id <- letters[1:3] # add an `id` variable
reshape(df,
        direction = "long",
        varying = 1:3,
        timevar = "condition"
)
#### OUTPUT ####
     id condition value
a.C1  a        C1     0
b.C1  b        C1     1
c.C1  c        C1     1
a.C2  a        C2     1
b.C2  b        C2     1
c.C2  c        C2     0
a.C3  a        C3     0
b.C3  b        C3     0
c.C3  c        C3     0
Another base R option (credit to Onyambu) is using `cbind` and `stack`. It's not as easily generalizable to more complex cases, but it's definitely possible with some tweaking. This should work with your example data without any issues (you will need to rename the resulting `values` and `ind` columns if you want different names):
cbind(id = 1:nrow(df), stack(df))
#### OUTPUT ####
  id values ind
1  1      0  C1
2  2      1  C1
3  3      1  C1
4  1      1  C2
5  2      1  C2
6  3      0  C2
7  1      0  C3
8  2      0  C3
9  3      0  C3
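If you want the `stack` result to line up with the other outputs, one way to handle the renaming is sketched below. The target names `condition` and `value` are just my choice for illustration, and `df` is again reconstructed from the printed output:

```r
# Assumed example data, reconstructed from the printed output
df <- data.frame(C1 = c(0, 1, 1), C2 = c(1, 1, 0), C3 = c(0, 0, 0))

# stack() returns two columns: `values` (the data) and `ind` (source column)
long <- cbind(id = seq_len(nrow(df)), stack(df))

# Rename by matching on the old names, then reorder the columns
names(long)[names(long) == "values"] <- "value"
names(long)[names(long) == "ind"] <- "condition"
long <- long[, c("id", "condition", "value")]
long
```

Note that `stack` stores the former column names as a factor, so wrap `long$condition` in `as.character()` if you need plain strings.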
reshape2
Yet another option would be `melt` from the reshape2 package. `melt` is pretty simple to use, but it has been superseded by `gather` (which has itself been superseded by `pivot_longer` as of tidyr 1.0.0):
library(reshape2)
df$id <- 1:nrow(df) # add id variable
melt(df, id.vars = "id")
#### OUTPUT ####
  id variable value
1  1       C1     0
2  2       C1     1
3  3       C1     1
4  1       C2     1
5  2       C2     1
6  3       C2     0
7  1       C3     0
8  2       C3     0
9  3       C3     0
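If you are on tidyr 1.0.0 or later, the same reshaping can be sketched with `pivot_longer`, the newer tidyr function for this job. The `df` below is reconstructed from the printed output, so treat it as an assumption:

```r
library(tidyr)
library(tibble)

# Assumed example data, reconstructed from the printed output
df <- data.frame(C1 = c(0, 1, 1), C2 = c(1, 1, 0), C3 = c(0, 0, 0))

long <- pivot_longer(
  rownames_to_column(df, "id"),
  cols = -id,                 # pivot every column except `id`
  names_to = "conditions",    # column that receives the old column names
  values_to = "values"        # column that receives the cell values
)
long
```

One difference to be aware of: `pivot_longer` returns a tibble and orders the rows by original row (all of `id` 1 first, then `id` 2, and so on) rather than by column, so the row order won't match the `gather` and `melt` outputs above.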