4

How do I specify the column order, based on column 'Col' when using dcast?

df <- dcast(x, ID ~ ColumnName, value.var = "Answer")

I need the solution to be non specific to the data as x can be the results of any question (thus Col can be 1-3 or 1-2 etc.). Two dummy examples below of x.

ID        Answer        ColumnName        Col
1         Anduin        First Name        1
1         Wrynn         Surname           2
1         Alliance      Faction           3
2         Sylvanas      First Name        1
2         Windrunner    Surname           2
2         Horde         Faction           3



ID        Answer        ColumnName        Col
1         The Kirin Tor Quest             1
1         90            Level             2
2         Emissary      Quest             1
2         38            Level             2
shujufenxishi
  • 53
  • 1
  • 5

3 Answers3

13

It is not well documented, neither in help("dcast", "reshape2") nor in help("dcast", "data.table") but the columns are ordered by factor levels if ColumnName is a factor.

Without factor()

The columns are ordered alphabetically which is not in line with the order specified in Col

reshape2::dcast(x, ID ~ ColumnName, value.var = "Answer")
  ID  Faction First Name    Surname
1  1 Alliance     Anduin      Wrynn
2  2    Horde   Sylvanas Windrunner

With factor levels ordered by appearance

Apparently, the OP is using Col to number ColumnName in order of appearance. Thus, Col is redundant and can be ignored.

There are several ways to create factor levels ordered by appearance directly in the formula of dcast():

Base R

reshape2::dcast(x1, ID ~ factor(ColumnName, levels = unique(ColumnName)), value.var = "Answer")
  ID First Name    Surname  Faction
1  1     Anduin      Wrynn Alliance
2  2   Sylvanas Windrunner    Horde
reshape2::dcast(x2, ID ~ factor(ColumnName, levels = unique(ColumnName)), value.var = "Answer")
  ID         Quest Level
1  1 The Kirin Tor    90
2  2      Emissary    38

As requested by the OP, the solution is non specific to the data.

forcats

The forcats package has some handy functions to deal with factors. as_factor() creates levels in the order in which they appear:

reshape2::dcast(x1, ID ~ forcats::as_factor(ColumnName), value.var = "Answer")

(The result is the same as above.)

To make the code more explicit, fct_inorder() can be used:

reshape2::dcast(x1, ID ~ forcats::fct_inorder(ColumnName), value.var = "Answer")

With factor levels ordered by Col

In case the requested column order is different from the order of appearance, the columns can be ordered by factor levels which are derived from Col. For demonstration, a modified sample dataset is used.

With base R,

reshape2::dcast(x3, ID ~ reorder(factor(ColumnName), Col), value.var = "Answer")
  ID    Surname  Faction First Name
1  1      Wrynn Alliance     Anduin
2  2 Windrunner    Horde   Sylvanas

or with forcats

reshape2::dcast(x3, ID ~ forcats::fct_reorder(ColumnName, Col), value.var = "Answer")

Data

x1 <- readr::read_table(
"ID        Answer        ColumnName        Col
1         Anduin        First Name        1
1         Wrynn         Surname           2
1         Alliance      Faction           3
2         Sylvanas      First Name        1
2         Windrunner    Surname           2
2         Horde         Faction           3")

x2 <- data.table::fread(
'ID        Answer           ColumnName        Col 
1         "The Kirin Tor"  Quest             1 
1         90               Level             2 
2         Emissary         Quest             1 
2         38               Level             2')

Modified sample dataset with a given column order (other than order of appearance or alphabetically order):

x3 <- readr::read_table(
"ID        Answer        ColumnName        Col
1         Anduin        First Name        3
1         Wrynn         Surname           1
1         Alliance      Faction           2
2         Sylvanas      First Name        3
2         Windrunner    Surname           1
2         Horde         Faction           2")
Uwe
  • 41,420
  • 11
  • 90
  • 134
0

I am not seeing a dcast argument that allows to specify the final order of columns but it is always possible to subset the result.

fun_dcast <- function(DF, formula = ID ~ ColumnName, value.var = "Answer"){
  reshape2::dcast(DF, formula, value.var = value.var)[c(1, 1 + unique(DF[["Col"]]))]
}

fun_dcast(x1)
#  ID  Faction First Name    Surname
#1  1 Alliance     Anduin      Wrynn
#2  2    Horde   Sylvanas Windrunner

fun_dcast(x2)
#  ID Level         Quest
#1  1    90 The Kirin Tor
#2  2    38      Emissary

Data.

x1 <- read.table(text = "
ID        Answer        ColumnName        Col
1         Anduin        'First Name'        1
1         Wrynn         Surname           2
1         Alliance      Faction           3
2         Sylvanas      'First Name'        1
2         Windrunner    Surname           2
2         Horde         Faction           3
", header = TRUE)


x2 <- read.table(text = "
ID        Answer        ColumnName        Col
1         'The Kirin Tor' Quest             1
1         90            Level             2
2         Emissary      Quest             1
2         38            Level             2
", header = TRUE)
Rui Barradas
  • 70,273
  • 8
  • 34
  • 66
0

I know this is ancient, but I think the following solution is simple. It does require you to provide a vector of all possible column names that might be encountered, though (i.e. desiredOrder).

thisdf <- data.frame(c3 = c("milk", "cheese", "toast", "gravy","spice"),
                      c2 = c(23, 41, 32, 58, 26),
                      c1 = c("Jon", "Bill", "Maria", "Ben", "Tina")
)
desiredOrder <- c("c1","c2","c3","c4")
colsPresent <- desiredOrder[desiredOrder %in% names(thisdf)]
thisdf<-thisdf[,colsPresent]
gruvn
  • 692
  • 1
  • 6
  • 25