I have a dataframe, and I would like to sort the rows into a custom order using a vector of row numbers. However, when I try to do this, my dataframe gets converted into a vector. How can I keep it as a dataframe? Here's a simplified example, but my real data has multiple columns, each with a different variable.

library(tibble)

# Simplified dataframe with a single numerical variable (in reality there are more variables, each in a separate column)
my_df <- data.frame(name = c(paste("sample", seq(1:12), sep = "_")),
                    treatment = c(rep(1, 4),
                                  rep(2, 5),
                                  rep(1, 3))) %>%
  tibble::column_to_rownames("name")
  
# The order I want the rows to be in
sample_order <- c(1, 2, 3, 5, 4, 6, 7, 8, 9, 10, 11, 12)

# Attempt at changing the row order converts the dataframe into a vector
sorted_df <- my_df[sample_order,]
Mike

1 Answer

In addition to Andre Wildberg's comment, you could also use arrange() from the dplyr package:

library(dplyr, warn.conflicts = FALSE)

# Simplified dataframe with a single numerical variable (in reality there are more variables, each in a separate column)
my_df <- data.frame(name = c(paste("sample", seq(1:12), sep = "_")),
                    treatment = c(rep(1, 4),
                                  rep(2, 5),
                                  rep(1, 3))) %>%
  tibble::column_to_rownames("name")

# The order I want the rows to be in
sample_order <- c(1, 2, 3, 5, 4, 6, 7, 8, 9, 10, 11, 12)

# Base R solution (per Andre Wildberg)
sorted_df <- my_df[sample_order, , drop = FALSE]
sorted_df
#>           treatment
#> sample_1          1
#> sample_2          1
#> sample_3          1
#> sample_5          2
#> sample_4          1
#> sample_6          2
#> sample_7          2
#> sample_8          2
#> sample_9          2
#> sample_10         1
#> sample_11         1
#> sample_12         1

# Dplyr solution
sorted_df2 <- my_df %>%
  arrange(sample_order)
sorted_df2
#>           treatment
#> sample_1          1
#> sample_2          1
#> sample_3          1
#> sample_5          2
#> sample_4          1
#> sample_6          2
#> sample_7          2
#> sample_8          2
#> sample_9          2
#> sample_10         1
#> sample_11         1
#> sample_12         1

all.equal(sorted_df, sorted_df2)
#> [1] TRUE
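
One caveat worth checking before reusing this on other data: arrange(sample_order) sorts the rows by the values in sample_order, which is equivalent to my_df[order(sample_order), ], not my_df[sample_order, ]. The two agree here only because swapping rows 4 and 5 is its own inverse. For an arbitrary order, slice() is the dplyr counterpart of positional indexing. A minimal sketch with a made-up three-row data frame (df and perm are hypothetical names, not from the question):

```r
library(dplyr, warn.conflicts = FALSE)

# Hypothetical data: an id column is used instead of row names to keep the comparison simple
df <- data.frame(id = c("a", "b", "c"), value = c(10, 20, 30))

# Desired order: row 3 first, then row 1, then row 2
perm <- c(3, 1, 2)

base_ids    <- df[perm, , drop = FALSE]$id  # positional indexing
slice_ids   <- slice(df, perm)$id           # dplyr positional indexing
arrange_ids <- arrange(df, perm)$id         # sorts by the values of perm instead

base_ids     # "c" "a" "b"
slice_ids    # "c" "a" "b"
arrange_ids  # "b" "c" "a", i.e. df[order(perm), ]
```

So if the goal is "put the rows in exactly this order of row numbers", slice() or the base R indexing is probably the safer choice.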

Created on 2023-08-21 with reprex v2.0.2
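
As for why the original attempt returned a vector: when rows are selected with `[` and the result has only one column, R simplifies it to a vector unless drop = FALSE is given. With more than one column (as in the real data described in the question) the result stays a data frame either way. A small base R sketch, using a hypothetical one-column data frame (df and idx are illustrative names):

```r
# Hypothetical one-column data frame with row names, mirroring the question's structure
df <- data.frame(treatment = c(1, 1, 2),
                 row.names = c("sample_1", "sample_2", "sample_3"))
idx <- c(3, 1, 2)

dropped <- df[idx, ]                # single column, so the result is simplified to a vector
kept    <- df[idx, , drop = FALSE]  # stays a data frame and keeps the row names

is.data.frame(dropped)  # FALSE
is.data.frame(kept)     # TRUE
rownames(kept)          # "sample_3" "sample_1" "sample_2"
```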

jared_mamrot