1

I have a data frame with four columns, and want transform it to a data frame with 2 columns, but sequence matters (so stack or merge without additional sorting is not an option)

X1 Y1 X2 Y2
1  2  3  4
5  6  7  8

To

X1 Y1 
1  2
3  4
5  6
7  8

My ugly version:

nrow = 4
# Test data set
d = setNames(data.frame(matrix(sample(1:(nrow*4)), nrow=nrow)), 
             c("X1","Y1","X2","Y2"))d

# Create empty data frame
d1 = data.frame(matrix(rep(NA, nrow*2*2), nrow = nrow*2))
# Elements 1, 3, 5...
d1[seq(1, nrow*2, by = 2),]   = d[,1:2]
# Elements 2, 4, 6...
d1[seq(2, nrow*2, by = 2),]   = d[,3:4]

Not necessary base R.

Added later: I just found:

data.frame(matrix(as.vector(t(as.matrix(d))), nrow = 2*nrow, byrow = TRUE))

but looks like @akrun has a slightly simpler version of it

The alternative solution in the post mentioned by @alistaire, for example using reshape, are definitively not more elegant than my orginal version.

Dieter Menne
  • 10,076
  • 44
  • 67
  • I did not find that one, but the solutions presented there are much messier than my orginal one. I tried to avoid the keyword "merge" and used "intercalate" from my DNA-research past. – Dieter Menne Jan 02 '17 at 16:05

1 Answers1

2

If the 'X' and 'Y' columns are alternating, then , we transpose the dataset, convert it to matrix by specifying the number of columns of original dataset divided by 2 and convert to data.frame

as.data.frame(matrix(t(df1), ncol= ncol(df1)/2, byrow=TRUE,
                               dimnames = list(NULL, c("X1", "Y1"))))
#   X1 Y1
#1  1  2
#2  3  4
#3  5  6
#4  7  8

Or use melt from data.table, we can take multiple measure patterns based on the column name to convert to 'long' format

library(data.table)
melt(setDT(df1), measure = patterns("^X", "^Y"), 
       value.name = c("X1", "Y1"))[, variable := NULL][order(X1)]
#   X1 Y1
#1:  1  2
#2:  3  4
#3:  5  6
#4:  7  8
akrun
  • 874,273
  • 37
  • 540
  • 662