75

Is there a method for moving a column from one position in a data.frame to the next - without typing an entirely new data.frame()

For example:

a <- b <- c <- d <- e <- f <- g <- 1:100
df <- data.frame(a,b,c,d,e,f,g)

Now let's say I wanted "g" in front of "a"

I could retype it, as

df <- data.frame(g,a,b,c,d,e,f)

But is there not a quicker way? (Imagine 1500+ columns)

Brandon Bertelsen
  • 43,807
  • 34
  • 160
  • 255
  • `append` function can be helpful... – aL3xa Dec 02 '10 at 19:00
  • @aL3xa, I recently wrote [a function](http://stackoverflow.com/a/18540144/1270695) that takes the `append` approach. – A5C1D2H2I1M1N2O1R2T1 Aug 30 '13 at 19:40
  • An easy `data.table` alternative: [R data.table compute new column, but insert at beginning](https://stackoverflow.com/questions/48754756/r-data-table-compute-new-column-but-insert-at-beginning) – Henrik May 08 '19 at 10:50

17 Answers17

78

The subset function has a nice select argument that gives a convenient way to select ranges of columns by name:

df <- subset(df, select=c(g,a:f))
Ken Williams
  • 22,756
  • 10
  • 85
  • 147
  • 2
    Note: this also works with numbers instead of colnames – Bas Mar 09 '16 at 08:21
  • 1
    not applicable to hundreds of columns, with real names – Ferroao Feb 25 '17 at 17:45
  • 2
    @Ferroao - it doesn't matter whether it's real names or these fake ones `a:g`, it works just as well if you do `subset(df, select=c(foo, bar:baz))`. In particular it doesn't depend on the names being ordered, if that's what you're concerned about. – Ken Williams Feb 26 '17 at 06:20
65

I wrote this function recently called moveme. It's designed to work on vectors, with the intent of shuffling column orders around.

Here's the function:

moveme <- function (invec, movecommand) {
  movecommand <- lapply(strsplit(strsplit(movecommand, ";")[[1]], 
                                 ",|\\s+"), function(x) x[x != ""])
  movelist <- lapply(movecommand, function(x) {
    Where <- x[which(x %in% c("before", "after", "first", 
                              "last")):length(x)]
    ToMove <- setdiff(x, Where)
    list(ToMove, Where)
  })
  myVec <- invec
  for (i in seq_along(movelist)) {
    temp <- setdiff(myVec, movelist[[i]][[1]])
    A <- movelist[[i]][[2]][1]
    if (A %in% c("before", "after")) {
      ba <- movelist[[i]][[2]][2]
      if (A == "before") {
        after <- match(ba, temp) - 1
      }
      else if (A == "after") {
        after <- match(ba, temp)
      }
    }
    else if (A == "first") {
      after <- 0
    }
    else if (A == "last") {
      after <- length(myVec)
    }
    myVec <- append(temp, values = movelist[[i]][[1]], after = after)
  }
  myVec
}

Usage is simple. Try these out:

moveme(names(df), "g first")
moveme(names(df), "g first; a last; e before c")

Of course, using it to reorder the columns in your data.frame is straightforward:

df[moveme(names(df), "g first")]

And for data.tables (moves by reference, no copy) :

setcolorder(dt, moveme(names(dt), "g first"))

The basic options are:

  • first
  • last
  • before
  • after

Compounded moves are separated by a semicolon.

Matt Dowle
  • 58,872
  • 22
  • 166
  • 224
A5C1D2H2I1M1N2O1R2T1
  • 190,393
  • 28
  • 405
  • 485
  • 8
    this function is really useful! Why not include it in data.table? – Wet Feet Jan 17 '14 at 08:05
  • @WetFeet, thanks! I enjoyed writing the Function. Not sure if Matthew Dowle sees an actual need for it though :-) – A5C1D2H2I1M1N2O1R2T1 Jan 17 '14 at 10:53
  • excellent, that's going in my .Rprofile. +1 for adding to data.table. Stata has a similar capabality – JasonB Feb 26 '14 at 17:34
  • it would be interesting to make it work when column names have spaces "column name" – Ferroao Feb 27 '17 at 16:41
  • @Ferroao That would be a good modification. Not sure just yet how to implement it reliably. – A5C1D2H2I1M1N2O1R2T1 Feb 28 '17 at 00:23
  • maybe the user can put \`col name\` – Ferroao Feb 28 '17 at 14:31
  • I use this function all the time. Massively helpful. +1 – Will Phillips Mar 29 '17 at 13:09
  • it is very slow on data frames of many rows though. Nearly 28 seconds on a data frame of 8 columns, 1.6 milion rows – Mark Feb 15 '18 at 18:12
  • 3
    @Mark, sorry--cannot reproduce, and there's no reason that it should depend on the size of the `data.frame` -- only on the number of moves required. Perhaps you're mixing up the time it takes to *print* a large `data.frame` to the console rather than the amount of time it takes to reorder the columns. For instance, if you have a `data.frame` named "DF" and did `system.time(DF[moveme(names(DF), "V8 before V5; V3 last")])` that should take no time, but `system.time(print(DF[moveme(names(DF), "V8 before V5; V3 last")]))` would definitely rack up at least a few seconds (7 seconds on my system). – A5C1D2H2I1M1N2O1R2T1 Feb 16 '18 at 06:27
61

Here is one way to do it:

> col_idx <- grep("g", names(df))
> df <- df[, c(col_idx, (1:ncol(df))[-col_idx])]
> names(df)
[1] "g" "a" "b" "c" "d" "e" "f"
rcs
  • 67,191
  • 22
  • 172
  • 153
59

Use select from the dplyr package and its everything() function to move specific columns to the start or end of a data.frame.

Move to the beginning:

library(dplyr)
df %>%
  select(g, everything())

Move to the end:

df %>%
  select(-a, everything())

Or without the %>% pipe operator, those would be select(df, g, everything()) and select(df, -a, everything()) respectively.

Sam Firke
  • 21,571
  • 9
  • 87
  • 105
14

Here is my solution

df[c(7,1:6)]

or you can also reorder by column name:

df[c("g",names(df)[-7])]
jpmarindiaz
  • 1,599
  • 1
  • 13
  • 21
7

Use relocate from dplyr package

mtcars %>% 
   # dplyr::relocate(disp) %>% ## simply make disp the first column
   relocate(starts_with("c"), .after = disp)  %>% ## more complex column order shuffling
   head(3)

Note, that the function was added with version 1.0, see https://www.tidyverse.org/blog/2020/03/dplyr-1-0-0-select-rename-relocate/

Holger Brandl
  • 10,634
  • 3
  • 64
  • 63
4

This is a very old post , but I developed this code which dynamically changes column position within a dataframe. Just change the value of n and Column Name ("g" here) and get dataframe with new column arrangements.

df1 = subset(df, select = c(head(names(df),n=3),"g", names(df) [! names(df) %in% c(head(names(df),n=3),"g")]))
Sayak
  • 183
  • 1
  • 11
3

This is slightly more elegant and allows to arrange first few leftmost columns and leave the rest unarranged to the right.

ordered_columns_leftside=c('var10','var34','var8')
df=df[c(ordered_columns_leftside, setdiff(names(df),ordered_columns_leftside))]
Sergey
  • 31
  • 2
  • Of all the many alternatives provided to move columns without having to name all columns (and thus useful for moving only the last column in a dataframe with doznes of colums) this is the only one that worked for me. – Krug May 05 '16 at 01:26
1

Here's a similar way I used to move 'n'th column to 2nd position in a huge data frame based on the column name.

Move a column to first position:

## Move a column with name "col_name"  to first column 
colX <- grep("^col_name", colnames(df.original)) 
# get the column position from name 

df.reordered.1 <- df.original[,c(colX,1:(colX-1), (colX+1):length(df.original))]  
# get new reordered data.frame
# if the column is the last one, error "undefined columns selected" will show up. Then do the following command instead of this

df.reordered.1 <- df.original[,c(colX,1:(colX-1)]  
# get new reordered data.frame, if the column is the last one

From anywhere to To 'n'th position

## Move a column with name "col_name"  to column position "n", 
## where n > 1 (in a data.frame "df.original")

colX <- grep("^col_name", colnames(df.original)) 
# get the column position from name 

n <- 2 
# give the new expected column position (change to the position you need) 

df.reordered.2 <- df.original[,c(1:(n-1), colX, n:(colX-1), (colX+1):length(df.original))] 
# get new reordered data.frame

## Optional; to replace the original data frame with sorted data.frame 
## if the sorting looks good
df.original <- df.reordered.2
rm(df.reordered.2) # remove df
Insilico
  • 866
  • 9
  • 10
1

Here is a simple but flexible function I wrote to move a column anywhere in a data frame.

move.col <- function(df, move_this, next_to_this, before = FALSE) {
  if (before==FALSE)
    df[,c(match(setdiff(names(df)[1:which(names(df)==next_to_this)],move_this),names(df)),
          match(move_this,names(df)),
          match(setdiff(names(df)[which(names(df)==next_to_this):ncol(df)],c(next_to_this,move_this)),names(df)))]
  else
    df[,c(match(setdiff(names(df)[1:(which(names(df)==next_to_this))],c(next_to_this,move_this)),names(df)),
          match(move_this,names(df)),
          match(setdiff(names(df)[(which(names(df)==next_to_this)):ncol(df)],move_this),names(df)))]
}

Usage: Specify the data frame (df), the column name you want to move (move_this), and the column name of which you want to move beside (next_to_this). By default, the function will move the move_this column after the next_to_this column. You can specify before = TRUE to move move_this before next_to_this.

Examples:

  1. Move "b" after "g" (i.e., make "b" last column).

move.col(df, "b", "g")

  1. Move "c" after "e".

move.col(df, "c", "e")

  1. Move "g" before "a" (i.e., make "g" first column).

move.col(df, "g", "a", before=TRUE)

  1. Move "d" and "f" before "b" (i.e., move multiple columns).

move.col(df,c("d","f"),"b", before=TRUE)

LC-datascientist
  • 1,960
  • 1
  • 18
  • 32
1

Most solutions seem overly verbose or lack encapsulation. Here's another way to solve the problem

push_left <- function(df, pushColNames){
    df[, c(pushColNames, setdiff(names(df), pushColNames))]
}

push_left(iris, c("Species", "Sepal.Length"))
Holger Brandl
  • 10,634
  • 3
  • 64
  • 63
1

I found a pretty simple way of doing this that suited my needs and doesn't take much time.

You have the following column names: "a", "b", "c", "d", "e", "f", "g", "h", "i", "j"

Move "d" to second position (after "a"):

attach(df)

df <- cbind(a, d, df[,c(2:3,5:10)])

Move "j" to 4th position (after "c"):

df <- cbind(df[,c(1:3)], j, df[,c(4:9)])
1

For data.table users :

Use setcolorder().

setDT(df) # convert into data.table

setcolorder(df,'g') # column g comes first if not all column names are mentioned
setcolorder(df, 7)  # same as above

df

       g   a   b   c   d   e   f
  1:   1   1   1   1   1   1   1
  2:   2   2   2   2   2   2   2
  3:   3   3   3   3   3   3   3
  4:   4   4   4   4   4   4   4
  5:   5   5   5   5   5   5   5
  6:   6   6   6   6   6   6   6
  7:   7   7   7   7   7   7   7

In case when column 'a' and 'b' should be moved to rightmost:

setcolorder(df,3:7) 
df

       c   d   e   f   g   a   b
  1:   1   1   1   1   1   1   1
  2:   2   2   2   2   2   2   2
  3:   3   3   3   3   3   3   3
  4:   4   4   4   4   4   4   4
  5:   5   5   5   5   5   5   5
  6:   6   6   6   6   6   6   6
  7:   7   7   7   7   7   7   7
Matthew Son
  • 1,109
  • 8
  • 27
0

If the reordering is a shift, as in your example, you can use the shift function from the taRifx package. It acts on vectors, hence apply it to the column names:

> a <- b <- c <- d <- e <- f <- g <- 1:5
> df <- data.frame(a,b,c,d,e,f,g)
> df[, taRifx::shift(seq_along(df),-1)]
  g a b c d e f
1 1 1 1 1 1 1 1
2 2 2 2 2 2 2 2
3 3 3 3 3 3 3 3
4 4 4 4 4 4 4 4
5 5 5 5 5 5 5 5

In fact the shift function can also be applied to a data frame, but not as expected. You can write a function:

> shift_df <- function(df, n) df[, taRifx::shift(seq_along(df),n)]
> shift_df(df, -1)
  g a b c d e f
1 1 1 1 1 1 1 1
2 2 2 2 2 2 2 2
3 3 3 3 3 3 3 3
4 4 4 4 4 4 4 4
5 5 5 5 5 5 5 5
> shift_df(df, 2)
  c d e f g a b
1 1 1 1 1 1 1 1
2 2 2 2 2 2 2 2
3 3 3 3 3 3 3 3
4 4 4 4 4 4 4 4
5 5 5 5 5 5 5 5
Stéphane Laurent
  • 75,186
  • 15
  • 119
  • 225
0

I would like to contribute another universal working approach, similar to the previous answers of rcs, Manuel and Scott Kaiser, which only work in specific cases:

move<-function(new.pos,nameofcolumn,dfname) {
  col_idx <- grep(nameofcolumn, names(dfname))
  if (length(col_idx)==0){print("invalid column name");return(dfname)} else {
  if(new.pos>ncol(dfname)){print("invalid column number");return(dfname)} else {
  if (new.pos==1) {
    b<-dfname[ , c( col_idx, c((new.pos):ncol(dfname))[-(abs(new.pos-1-col_idx))] )]  
    }
  else if(col_idx==1 & new.pos==ncol(dfname)){
    b<-dfname[ , c((1:(new.pos-1)+1), col_idx )] 
    }
  else if(col_idx==1){
    b<-dfname[ , c((1:(new.pos-1)+1), col_idx, c((new.pos+1):ncol(dfname)) )] 
    }
  else if(new.pos==ncol(dfname)){
    b<-dfname[ , c((1:(new.pos))[-col_idx], col_idx)] 
    }
  else if(new.pos>col_idx){
    b<-dfname[ , c((1:(new.pos))[-col_idx], col_idx, c((new.pos+1):ncol(dfname)) )] 
    } 
  else{
    b<-dfname[ , c((1:(new.pos-1)), col_idx, c((new.pos):ncol(dfname))[-(abs(new.pos-1-col_idx))] )]
    }
  return(b)
  if(length(ncol(b))!=length(ncol(dfname))){print("error")}
  }
}}

Usage:

a <- b <- c <- d <- e <- f <- g <- 1:5
df <- data.frame(a,b,c,d,e,f,g)
move(1,"g",df)
Ferroao
  • 3,042
  • 28
  • 53
-1

Here is one function that might help

  • df: the dataframe
  • ColName: the name of the column(s) to be moved
  • Position: the column number that you want the moved column to appear

moveCol <- function(df,ColName,Position=1) {
    D <- dim(df)[2]
    DFnames <- names(df)
    if (Position>D+1 | Position<1) {
        warning(paste0('Column position ',sprintf('%d',Position), ' is out of range [1-',sprintf('%d',D),']'))
        return()
    }
    for (i in ColName) {
        x <- i==DFnames
        if (all(!x)) {
            warning(paste0('Column \"', i, '\" not found'))
        } else {
            D1 <- seq(D)
            D1[x] = Position - 0.5
            df<- df[order(D1)]
        }
    }
    return(df)
}
-2

@David asked how to move "G" to an arbitrary position, such as 4. Building on @rcs answer,

new.pos <- 4
col_idx <- grep("g", names(df))
df      <- df[ , c((1:new.pos)[-col_idx], col_idx, c((new.pos):ncol(df))[-col_idx])]
Scott Kaiser
  • 307
  • 4
  • 11
  • Doesn't work... try with `df = mtcars` moving the `"cyl"` column to position 4: You get two copies of the `hp` column and the `drat` column is dropped. – Gregor Thomas Feb 22 '17 at 18:57
  • I think you can make it work if instead of using `[-col_idx]` you use `setdiff(1:new.pos, col_idx)` (and similarly for the other end of the sequence). You also need the tail of the sequence to start at `new.pos + 1` otherwise the column originally at `new.pos` is duplicated. – Gregor Thomas Feb 22 '17 at 19:00