Moving columns within a data.frame() without retyping

Question

Is there a method for moving a column from one position in a data.frame to the next - without typing an entirely new data.frame()

For example:

a <- b <- c <- d <- e <- f <- g <- 1:100
df <- data.frame(a,b,c,d,e,f,g)

Now let's say I wanted "g" in front of "a"

I could retype it, as

df <- data.frame(g,a,b,c,d,e,f)

But is there not a quicker way? (Imagine 1500+ columns)

@aL3xa, I recently wrote [a function](http://stackoverflow.com/a/18540144/1270695) that takes the `append` approach. — A5C1D2H2I1M1N2O1R2T1, Aug 30 '13 at 19:40
An easy `data.table` alternative: [R data.table compute new column, but insert at beginning](https://stackoverflow.com/questions/48754756/r-data-table-compute-new-column-but-insert-at-beginning) — Henrik, May 08 '19 at 10:50

Ken Williams · Answer 1 · 2017-02-26T06:21:32.327

78

The subset function has a nice select argument that gives a convenient way to select ranges of columns by name:

df <- subset(df, select=c(g,a:f))

edited Feb 26 '17 at 06:21

answered Jul 30 '10 at 16:10

Ken Williams

22,756
10
85
147

2

Note: this also works with numbers instead of colnames – Bas Mar 09 '16 at 08:21
1

not applicable to hundreds of columns, with real names – Ferroao Feb 25 '17 at 17:45
2

@Ferroao - it doesn't matter whether it's real names or these fake ones `a:g`, it works just as well if you do `subset(df, select=c(foo, bar:baz))`. In particular it doesn't depend on the names being ordered, if that's what you're concerned about. – Ken Williams Feb 26 '17 at 06:20

score 65 · Answer 2 · edited Aug 30 '13 at 23:34

65

I wrote this function recently called moveme. It's designed to work on vectors, with the intent of shuffling column orders around.

Here's the function:

moveme <- function (invec, movecommand) {
  movecommand <- lapply(strsplit(strsplit(movecommand, ";")[[1]], 
                                 ",|\\s+"), function(x) x[x != ""])
  movelist <- lapply(movecommand, function(x) {
    Where <- x[which(x %in% c("before", "after", "first", 
                              "last")):length(x)]
    ToMove <- setdiff(x, Where)
    list(ToMove, Where)
  })
  myVec <- invec
  for (i in seq_along(movelist)) {
    temp <- setdiff(myVec, movelist[[i]][[1]])
    A <- movelist[[i]][[2]][1]
    if (A %in% c("before", "after")) {
      ba <- movelist[[i]][[2]][2]
      if (A == "before") {
        after <- match(ba, temp) - 1
      }
      else if (A == "after") {
        after <- match(ba, temp)
      }
    }
    else if (A == "first") {
      after <- 0
    }
    else if (A == "last") {
      after <- length(myVec)
    }
    myVec <- append(temp, values = movelist[[i]][[1]], after = after)
  }
  myVec
}

Usage is simple. Try these out:

moveme(names(df), "g first")
moveme(names(df), "g first; a last; e before c")

Of course, using it to reorder the columns in your data.frame is straightforward:

df[moveme(names(df), "g first")]

And for data.tables (moves by reference, no copy) :

setcolorder(dt, moveme(names(dt), "g first"))

The basic options are:

first
last
before
after

Compounded moves are separated by a semicolon.

edited Aug 30 '13 at 23:34

Matt Dowle

58,872
22
166
224

answered Aug 30 '13 at 19:25

A5C1D2H2I1M1N2O1R2T1

190,393
28
405
485

8

this function is really useful! Why not include it in data.table? – Wet Feet Jan 17 '14 at 08:05
@WetFeet, thanks! I enjoyed writing the Function. Not sure if Matthew Dowle sees an actual need for it though :-) – A5C1D2H2I1M1N2O1R2T1 Jan 17 '14 at 10:53
excellent, that's going in my .Rprofile. +1 for adding to data.table. Stata has a similar capabality – JasonB Feb 26 '14 at 17:34
it would be interesting to make it work when column names have spaces "column name" – Ferroao Feb 27 '17 at 16:41
@Ferroao That would be a good modification. Not sure just yet how to implement it reliably. – A5C1D2H2I1M1N2O1R2T1 Feb 28 '17 at 00:23
maybe the user can put \`col name\` – Ferroao Feb 28 '17 at 14:31
I use this function all the time. Massively helpful. +1 – Will Phillips Mar 29 '17 at 13:09
it is very slow on data frames of many rows though. Nearly 28 seconds on a data frame of 8 columns, 1.6 milion rows – Mark Feb 15 '18 at 18:12
3

@Mark, sorry--cannot reproduce, and there's no reason that it should depend on the size of the `data.frame` -- only on the number of moves required. Perhaps you're mixing up the time it takes to *print* a large `data.frame` to the console rather than the amount of time it takes to reorder the columns. For instance, if you have a `data.frame` named "DF" and did `system.time(DF[moveme(names(DF), "V8 before V5; V3 last")])` that should take no time, but `system.time(print(DF[moveme(names(DF), "V8 before V5; V3 last")]))` would definitely rack up at least a few seconds (7 seconds on my system). – A5C1D2H2I1M1N2O1R2T1 Feb 16 '18 at 06:27

score 61 · Accepted Answer · answered Jul 30 '10 at 09:14

61

Here is one way to do it:

> col_idx <- grep("g", names(df))
> df <- df[, c(col_idx, (1:ncol(df))[-col_idx])]
> names(df)
[1] "g" "a" "b" "c" "d" "e" "f"

answered Jul 30 '10 at 09:14

rcs

67,191
22
172
153

22

I think this is the same basic idea: `df[,c("g",setdiff(names(df),"g"))]` – Frank Aug 31 '13 at 00:17
5

How could I use this to move "G" to any position I wanted? Perhaps I wanted "G" to be the 2nd column or the 4th? – JohnN Sep 18 '15 at 16:19
2

@david probably be easier using Ken's solution. Something like `subset(df, select=c(a:b,g,c:f))` – Brandon Bertelsen May 02 '16 at 01:30

Sam Firke · Answer 4 · 2015-06-23T20:52:18.510

59

Use select from the dplyr package and its everything() function to move specific columns to the start or end of a data.frame.

Move to the beginning:

library(dplyr)
df %>%
  select(g, everything())

Move to the end:

df %>%
  select(-a, everything())

Or without the %>% pipe operator, those would be select(df, g, everything()) and select(df, -a, everything()) respectively.

edited Jun 23 '15 at 20:52

answered May 27 '15 at 01:47

Sam Firke

21,571
9
87
105

`df %>% select(-a, a)` also works to move column `a` to the end – jtr13 Oct 12 '19 at 16:56
Does it offer the option to move to nth place? Ty. – Omar Gonzales May 27 '20 at 19:00

score 14 · Answer 5 · answered Aug 27 '14 at 23:37

14

Here is my solution

df[c(7,1:6)]

or you can also reorder by column name:

df[c("g",names(df)[-7])]

answered Aug 27 '14 at 23:37

jpmarindiaz

1,599
1
13
21

score 7 · Answer 6 · answered Apr 27 '20 at 19:16

Use relocate from dplyr package

mtcars %>% 
   # dplyr::relocate(disp) %>% ## simply make disp the first column
   relocate(starts_with("c"), .after = disp)  %>% ## more complex column order shuffling
   head(3)

Note, that the function was added with version 1.0, see https://www.tidyverse.org/blog/2020/03/dplyr-1-0-0-select-rename-relocate/

Sayak · Answer 7 · 2017-11-09T16:54:17.307

4

This is a very old post , but I developed this code which dynamically changes column position within a dataframe. Just change the value of n and Column Name ("g" here) and get dataframe with new column arrangements.

df1 = subset(df, select = c(head(names(df),n=3),"g", names(df) [! names(df) %in% c(head(names(df),n=3),"g")]))

edited Nov 09 '17 at 16:54

answered Nov 09 '17 at 16:48

Sayak

183
1
11

score 3 · Answer 8 · answered Oct 30 '15 at 21:49

3

This is slightly more elegant and allows to arrange first few leftmost columns and leave the rest unarranged to the right.

ordered_columns_leftside=c('var10','var34','var8')
df=df[c(ordered_columns_leftside, setdiff(names(df),ordered_columns_leftside))]

answered Oct 30 '15 at 21:49

Sergey

31
2

Of all the many alternatives provided to move columns without having to name all columns (and thus useful for moving only the last column in a dataframe with doznes of colums) this is the only one that worked for me. – Krug May 05 '16 at 01:26

Insilico · Answer 9 · 2016-07-18T17:41:07.260

Here's a similar way I used to move 'n'th column to 2nd position in a huge data frame based on the column name.

Move a column to first position:

## Move a column with name "col_name"  to first column 
colX <- grep("^col_name", colnames(df.original)) 
# get the column position from name 

df.reordered.1 <- df.original[,c(colX,1:(colX-1), (colX+1):length(df.original))]  
# get new reordered data.frame
# if the column is the last one, error "undefined columns selected" will show up. Then do the following command instead of this

df.reordered.1 <- df.original[,c(colX,1:(colX-1)]  
# get new reordered data.frame, if the column is the last one

From anywhere to To 'n'th position

## Move a column with name "col_name"  to column position "n", 
## where n > 1 (in a data.frame "df.original")

colX <- grep("^col_name", colnames(df.original)) 
# get the column position from name 

n <- 2 
# give the new expected column position (change to the position you need) 

df.reordered.2 <- df.original[,c(1:(n-1), colX, n:(colX-1), (colX+1):length(df.original))] 
# get new reordered data.frame

## Optional; to replace the original data frame with sorted data.frame 
## if the sorting looks good
df.original <- df.reordered.2
rm(df.reordered.2) # remove df

not working from several possibilities, example, colname= b n=5 — Ferroao, Feb 25 '17 at 17:42

LC-datascientist · Answer 10 · 2018-01-29T16:31:59.970

Here is a simple but flexible function I wrote to move a column anywhere in a data frame.

move.col <- function(df, move_this, next_to_this, before = FALSE) {
  if (before==FALSE)
    df[,c(match(setdiff(names(df)[1:which(names(df)==next_to_this)],move_this),names(df)),
          match(move_this,names(df)),
          match(setdiff(names(df)[which(names(df)==next_to_this):ncol(df)],c(next_to_this,move_this)),names(df)))]
  else
    df[,c(match(setdiff(names(df)[1:(which(names(df)==next_to_this))],c(next_to_this,move_this)),names(df)),
          match(move_this,names(df)),
          match(setdiff(names(df)[(which(names(df)==next_to_this)):ncol(df)],move_this),names(df)))]
}

Usage: Specify the data frame (df), the column name you want to move (move_this), and the column name of which you want to move beside (next_to_this). By default, the function will move the move_this column after the next_to_this column. You can specify before = TRUE to move move_this before next_to_this.

Examples:

Move "b" after "g" (i.e., make "b" last column).

move.col(df, "b", "g")

Move "c" after "e".

move.col(df, "c", "e")

Move "g" before "a" (i.e., make "g" first column).

move.col(df, "g", "a", before=TRUE)

Move "d" and "f" before "b" (i.e., move multiple columns).

move.col(df,c("d","f"),"b", before=TRUE)

score 1 · Answer 11 · answered Mar 27 '18 at 21:19

1

Most solutions seem overly verbose or lack encapsulation. Here's another way to solve the problem

push_left <- function(df, pushColNames){
    df[, c(pushColNames, setdiff(names(df), pushColNames))]
}

push_left(iris, c("Species", "Sepal.Length"))

answered Mar 27 '18 at 21:19

Holger Brandl

10,634
3
64
63

This one gets my vote because it will maintain labels – Patrick Apr 27 '20 at 14:14

score 1 · Answer 12 · answered Aug 28 '18 at 18:18

I found a pretty simple way of doing this that suited my needs and doesn't take much time.

You have the following column names: "a", "b", "c", "d", "e", "f", "g", "h", "i", "j"

Move "d" to second position (after "a"):

attach(df)

df <- cbind(a, d, df[,c(2:3,5:10)])

Move "j" to 4th position (after "c"):

df <- cbind(df[,c(1:3)], j, df[,c(4:9)])

score 1 · Answer 13 · answered May 14 '20 at 02:26

For data.table users :

Use setcolorder().

setDT(df) # convert into data.table

setcolorder(df,'g') # column g comes first if not all column names are mentioned
setcolorder(df, 7)  # same as above

df

       g   a   b   c   d   e   f
  1:   1   1   1   1   1   1   1
  2:   2   2   2   2   2   2   2
  3:   3   3   3   3   3   3   3
  4:   4   4   4   4   4   4   4
  5:   5   5   5   5   5   5   5
  6:   6   6   6   6   6   6   6
  7:   7   7   7   7   7   7   7

In case when column 'a' and 'b' should be moved to rightmost:

setcolorder(df,3:7) 
df

       c   d   e   f   g   a   b
  1:   1   1   1   1   1   1   1
  2:   2   2   2   2   2   2   2
  3:   3   3   3   3   3   3   3
  4:   4   4   4   4   4   4   4
  5:   5   5   5   5   5   5   5
  6:   6   6   6   6   6   6   6
  7:   7   7   7   7   7   7   7

score 0 · Answer 14 · answered Dec 23 '15 at 15:04

If the reordering is a shift, as in your example, you can use the shift function from the taRifx package. It acts on vectors, hence apply it to the column names:

> a <- b <- c <- d <- e <- f <- g <- 1:5
> df <- data.frame(a,b,c,d,e,f,g)
> df[, taRifx::shift(seq_along(df),-1)]
  g a b c d e f
1 1 1 1 1 1 1 1
2 2 2 2 2 2 2 2
3 3 3 3 3 3 3 3
4 4 4 4 4 4 4 4
5 5 5 5 5 5 5 5

In fact the shift function can also be applied to a data frame, but not as expected. You can write a function:

> shift_df <- function(df, n) df[, taRifx::shift(seq_along(df),n)]
> shift_df(df, -1)
  g a b c d e f
1 1 1 1 1 1 1 1
2 2 2 2 2 2 2 2
3 3 3 3 3 3 3 3
4 4 4 4 4 4 4 4
5 5 5 5 5 5 5 5
> shift_df(df, 2)
  c d e f g a b
1 1 1 1 1 1 1 1
2 2 2 2 2 2 2 2
3 3 3 3 3 3 3 3
4 4 4 4 4 4 4 4
5 5 5 5 5 5 5 5

Ferroao · Answer 15 · 2019-06-16T14:23:26.960

I would like to contribute another universal working approach, similar to the previous answers of rcs, Manuel and Scott Kaiser, which only work in specific cases:

move<-function(new.pos,nameofcolumn,dfname) {
  col_idx <- grep(nameofcolumn, names(dfname))
  if (length(col_idx)==0){print("invalid column name");return(dfname)} else {
  if(new.pos>ncol(dfname)){print("invalid column number");return(dfname)} else {
  if (new.pos==1) {
    b<-dfname[ , c( col_idx, c((new.pos):ncol(dfname))[-(abs(new.pos-1-col_idx))] )]  
    }
  else if(col_idx==1 & new.pos==ncol(dfname)){
    b<-dfname[ , c((1:(new.pos-1)+1), col_idx )] 
    }
  else if(col_idx==1){
    b<-dfname[ , c((1:(new.pos-1)+1), col_idx, c((new.pos+1):ncol(dfname)) )] 
    }
  else if(new.pos==ncol(dfname)){
    b<-dfname[ , c((1:(new.pos))[-col_idx], col_idx)] 
    }
  else if(new.pos>col_idx){
    b<-dfname[ , c((1:(new.pos))[-col_idx], col_idx, c((new.pos+1):ncol(dfname)) )] 
    } 
  else{
    b<-dfname[ , c((1:(new.pos-1)), col_idx, c((new.pos):ncol(dfname))[-(abs(new.pos-1-col_idx))] )]
    }
  return(b)
  if(length(ncol(b))!=length(ncol(dfname))){print("error")}
  }
}}

Usage:

a <- b <- c <- d <- e <- f <- g <- 1:5
df <- data.frame(a,b,c,d,e,f,g)
move(1,"g",df)

The `<<-` is unnecessary. Indentation and more spaces would make your code much more readable. — Gregor Thomas, Feb 22 '17 at 19:09

Dimitris Patikas · Answer 16 · 2018-12-31T11:34:15.477

Here is one function that might help

df: the dataframe
ColName: the name of the column(s) to be moved
Position: the column number that you want the moved column to appear

moveCol <- function(df,ColName,Position=1) {
    D <- dim(df)[2]
    DFnames <- names(df)
    if (Position>D+1 | Position<1) {
        warning(paste0('Column position ',sprintf('%d',Position), ' is out of range [1-',sprintf('%d',D),']'))
        return()
    }
    for (i in ColName) {
        x <- i==DFnames
        if (all(!x)) {
            warning(paste0('Column \"', i, '\" not found'))
        } else {
            D1 <- seq(D)
            D1[x] = Position - 0.5
            df<- df[order(D1)]
        }
    }
    return(df)
}

in several cases, +1 has to be added to "Position" to work as expected — Ferroao, Feb 23 '17 at 00:52

score -2 · Answer 17 · answered Mar 17 '16 at 22:07

-2

@David asked how to move "G" to an arbitrary position, such as 4. Building on @rcs answer,

new.pos <- 4
col_idx <- grep("g", names(df))
df      <- df[ , c((1:new.pos)[-col_idx], col_idx, c((new.pos):ncol(df))[-col_idx])]

answered Mar 17 '16 at 22:07

Scott Kaiser

307
4
11

Doesn't work... try with `df = mtcars` moving the `"cyl"` column to position 4: You get two copies of the `hp` column and the `drat` column is dropped. – Gregor Thomas Feb 22 '17 at 18:57
I think you can make it work if instead of using `[-col_idx]` you use `setdiff(1:new.pos, col_idx)` (and similarly for the other end of the sequence). You also need the tail of the sequence to start at `new.pos + 1` otherwise the column originally at `new.pos` is duplicated. – Gregor Thomas Feb 22 '17 at 19:00

Moving columns within a data.frame() without retyping

17 Answers17

Linked

Related