I'm trying to create a function that will order dataframe columns according to a specific column name order that I have as a vector.

I have tried:

order_df <- function(df, order_names){
  df <- df[, order_names]
  return(df)
}

and also

order_df <- function(df) {
  df <- df[, order_names]
  return(df)
}

where order_names is something like c("A", "C", "B", "D") and A,B,C,D are column names.

Both give the same error: Error: object 'df' not found

I essentially want it to do what is described in "Sort columns of a dataframe by column name", but inside a function.

Thank you in advance

Matt
VivG
  • What do you mean by order? Order the values or just the columns? – NelsonGon Jul 17 '19 at 16:18
  • You could just use ``setcolorder()`` – Gainz Jul 17 '19 at 16:19
  • That first function should work fine if all the arguments you supply to it exist. The error is indicating that you don't have any data frame named `df`. How are you calling this function, i.e. what code is producing the error? It wouldn't be the function definition you show in the question that's causing the error (unless you're attempting to run lines within the body instead of the entire function definition). – IceCreamToucan Jul 17 '19 at 16:22
  • Your code looks fine to me. You probably just called `order_df(df, order_names)` without first defining `df`. – asachet Jul 17 '19 at 16:28
  • This is where the error is produced: `df <- df[, order_names]` – VivG Jul 17 '19 at 16:34
  • @antoine-sac How can I define the df? What I would like to do is order_df(example_df) and get the columns sorted – VivG Jul 17 '19 at 16:41
  • That should work as long as `example_df` exists. You seem confused by the concept of function. When you call `order_df(example_df)`, the body of the function is executed with `df=example_df`. – asachet Jul 17 '19 at 16:53
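
The point made in the comments above can be checked with a small sketch. It assumes a hypothetical data frame `example_df` and a name `missing_df` that is deliberately left undefined; neither comes from the question.

```r
# Defining the function does not evaluate its body, so no error can occur here.
order_df <- function(df, order_names) {
  df[, order_names]
}

# Hypothetical example data (an assumption, not the asker's data).
example_df <- data.frame(B = 4:6, A = 1:3, D = 10:12, C = 7:9)
order_names <- c("A", "C", "B", "D")

# Works: inside the function, `df` is bound to `example_df`.
reordered <- order_df(example_df, order_names)
names(reordered)
# "A" "C" "B" "D"

# Fails: `missing_df` is not defined anywhere, so an error is raised
# the first time the function body uses `df`. The message is captured
# instead of stopping the script.
msg <- tryCatch(
  order_df(missing_df, order_names),
  error = function(e) conditionMessage(e)
)
msg
```

The error comes from the call, not from the definition, which matches what the commenters describe.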

2 Answers

With dplyr:

library(dplyr)

order_df <- function(df, order_names){
  df %>% 
    # all_of() (dplyr >= 1.0.0) marks order_names as a character vector of column names
    dplyr::select(all_of(order_names), everything())
}
order_df(iris, c("Species","Sepal.Length"))

Result (truncated):

  Species Sepal.Length Sepal.Width Petal.Length Petal.Width
1  setosa          5.1         3.5          1.4         0.2
2  setosa          4.9         3.0          1.4         0.2
3  setosa          4.7         3.2          1.3         0.2
4  setosa          4.6         3.1          1.5         0.2

With base:

order_df_2 <- function(df,order_names){
  other_names <- setdiff(names(df),order_names)
  df[,c(order_names,other_names)]
}
order_df_2(iris, c("Species","Sepal.Length"))
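
As a hedged variation on the base version, the sketch below adds two safeguards that are not part of the answer: a check for names that are missing from the data frame, and `drop = FALSE` so the result stays a data frame even when only one column remains. The name `order_df_checked` is hypothetical.

```r
# Illustrative variant of order_df_2; the name and the checks are assumptions.
order_df_checked <- function(df, order_names) {
  # Fail early with a readable message if a requested column does not exist.
  unknown <- setdiff(order_names, names(df))
  if (length(unknown) > 0) {
    stop("Columns not found: ", paste(unknown, collapse = ", "))
  }
  # Remaining columns keep their original relative order.
  other_names <- setdiff(names(df), order_names)
  # drop = FALSE prevents a single-column result from collapsing to a vector.
  df[, c(order_names, other_names), drop = FALSE]
}

names(order_df_checked(iris, c("Species", "Sepal.Length")))
# "Species" "Sepal.Length" "Sepal.Width" "Petal.Length" "Petal.Width"
```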
NelsonGon
  • the results you added look exactly like what I want but I get the error `Error in UseMethod("select_") : no applicable method for 'select_' applied to an object of class "function"` when I try – VivG Jul 17 '19 at 16:42
  • What error do you get? Hard to know without your data. Add a `dput` of your data to the question. Use `dput(head(df,n))`. Choose `n` as necessary. Also please show exactly what code you ran. – NelsonGon Jul 17 '19 at 16:43
  • Please see my comment on the accepted answer. In the future, please be explicit about exactly what you want to do. – NelsonGon Jul 17 '19 at 17:13
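
One plausible reading of the `select_` error quoted in the comments, offered as an assumption because the exact call was never shown: in a fresh R session the name `df` already refers to a function in the stats package (the density of the F distribution). Passing `df` without first assigning a data frame to it therefore hands a function to `order_df`. A minimal sketch using the base version:

```r
# With no data frame assigned to `df`, the name resolves to stats::df,
# which is a function rather than a data frame.
class(stats::df)
# "function"

order_df_2 <- function(df, order_names) {
  other_names <- setdiff(names(df), order_names)
  df[, c(order_names, other_names)]
}

# Passing that function instead of a data frame fails; the error is
# caught here so the script keeps running.
result <- tryCatch(
  order_df_2(stats::df, c("Species", "Sepal.Length")),
  error = function(e) "error"
)
result
# "error"
```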

First run the following code, which simply defines the function.

order_df <- function(df, order_names=c("A", "B", "C")) {
  df <- df[, order_names]
  return(df) 
}

The function takes two arguments; the second is optional and defaults to c("A", "B", "C") when it is not provided.

Your versions work as well, but a default argument is more conventional than relying on an order_names variable defined outside the function.

Now, all we've done is define the function order_df.

You can now call it on any existing data frame and it will work. If you call it on df when no data frame named df has been created, you should not be surprised to get an error saying exactly that.

# create a data.frame
example_df <- data.frame(
  A = 1:10,
  B = 11:20,
  K = 21:30,
  C = 31:40
)

# apply order_df
order_df(example_df)
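
To make the result of the call above concrete, here is a short sketch that repeats the definitions and inspects the column names. The explicit order `c("C", "K", "B", "A")` is an illustrative choice, not something from the question.

```r
order_df <- function(df, order_names = c("A", "B", "C")) {
  df[, order_names]
}

example_df <- data.frame(
  A = 1:10,
  B = 11:20,
  K = 21:30,
  C = 31:40
)

# Default order: column K is not listed, so it is dropped.
names(order_df(example_df))
# "A" "B" "C"

# An explicit order that lists every column keeps all of them.
names(order_df(example_df, c("C", "K", "B", "A")))
# "C" "K" "B" "A"
```

Because the function selects exactly the names it is given, any column left out of the vector is removed from the result.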
asachet
  • I think this is somewhat inefficient since it discards other columns. OP's definition of order seems vague anyway. They should change order to `select`. You cannot order while also discarding elements (my naive opinion, I guess). – NelsonGon Jul 17 '19 at 17:10
  • @NelsonGon you're absolutely right. Just tried to keep it simple for OP. Anyway, it's not clear what they're actually trying to achieve. – asachet Jul 17 '19 at 17:14