
I know that fct_reorder() lets you reorder a factor's levels based on another column, but from what I can tell, you have to supply a summary function (mean, median, etc.) that operates on the second column so it knows how to order the levels. But what if you already have another column that sorts exactly the way you want the factor's levels to be ordered?

For instance, I have a column ACADEMIC_PERIOD_DESC that gives the academic period in English: "Fall 2019," "Spring 2020," etc., and I have a corresponding column, ACADEMIC_PERIOD, that is a numeric code corresponding to the academic period: "201940," "202020," etc. This is the column I want ACADEMIC_PERIOD_DESC to be leveled by.

Data

df <- structure(list(ACADEMIC_PERIOD = c("200810", "200820", "200830", 
"200840", "200910", "200920", "200930", "200940", "201010", "201020"
), ACADEMIC_PERIOD_DESC = structure(1:10, .Label = c("J-Term 2008", 
"Spring 2008", "Summer 2008", "Fall 2008", "J-Term 2009", "Spring 2009", 
"Summer 2009", "Fall 2009", "J-Term 2010", "Spring 2010", "Summer 2010", 
"Fall 2010", "J-Term 2011", "Spring 2011", "Summer 2011", "Fall 2011", 
"J-Term 2012", "Spring 2012", "Summer 2012", "Fall 2012", "J-Term 2013", 
"Spring 2013", "Summer 2013", "Fall 2013", "Spring 2014", "Summer 2014", 
"Fall 2014", "J-Term 2015", "Spring 2015", "Summer 2015", "Fall 2015", 
"J-Term 2016", "Spring 2016", "Summer 2016", "Fall 2016", "J-Term 2017", 
"Spring 2017", "Summer 2017", "Fall 2017", "J-Term 2018", "Spring 2018", 
"Summer 2018", "Fall 2018", "J-Term 2019", "Spring 2019", "Summer 2019", 
"Fall 2019", "J-Term 2020", "Spring 2020"), class = "factor")), class = c("tbl_df", 
"tbl", "data.frame"), row.names = c(NA, -10L))

Should I just do the following even though it is unnecessarily applying the median?

library(dplyr)
library(forcats)
df %>% 
    mutate(ACADEMIC_PERIOD_DESC = fct_reorder(ACADEMIC_PERIOD_DESC, as.integer(ACADEMIC_PERIOD)))

I also know I can just use base R like this:

# ACADEMIC_PERIOD is stored as character, so convert it before reorder() averages it
df$ACADEMIC_PERIOD_DESC <- reorder(df$ACADEMIC_PERIOD_DESC, as.integer(df$ACADEMIC_PERIOD))

Is there a more elegant forcats/tidyverse solution? Am I just missing something?

Thanks!

talbe009
  • It would help if you make this [reproducible](https://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example). If you already have your data ordered or can order it easily and just want one column to follow that order, you can use `fct_inorder`. Or if you have unique values in this other column, taking the mean won't actually matter: `mean(c(5))` is 5 – camille Feb 24 '20 at 18:19
  • @camille I just added some data. Yes, to unique values so yeah the mean of a number is that number, but is there something even simpler in forcats? Like reorder this based on this column as is, because they are unique? – talbe009 Feb 24 '20 at 18:28
  • Not clear why `fct_reorder(ACADEMIC_PERIOD_DESC, as.integer(ACADEMIC_PERIOD))` is not elegant – akrun Feb 24 '20 at 18:32
  • @akrun I guess it is elegant, just that it's still taking the median of `ACADEMIC_PERIOD`, which I don't need it to do. – talbe009 Feb 24 '20 at 18:33
  • I'm confused... yes, `fct_reorder`, just like `base::reorder`, uses an aggregate function internally. But in both cases the default aggregate function will work just as well on unique values. So, just like your code shows, you don't need to specify the function for it to work. You want another function for this special case? (Don't worry---taking the mean or median of a single number is very efficient!) – Gregor Thomas Feb 24 '20 at 18:34
  • The `.fun` by default is `median`, you can change it to `I` – akrun Feb 24 '20 at 18:34
  • @GregorThomas I guess I was just concerned about the computation/efficiency, but nvm. Thanks, everyone! – talbe009 Feb 24 '20 at 19:37

1 Answer

We can change .fun from its default, median, to I, so that each value is used as is (this relies on ACADEMIC_PERIOD having exactly one value per level):

library(dplyr)
library(forcats)
df %>% 
   mutate(ACADEMIC_PERIOD_DESC = fct_reorder(ACADEMIC_PERIOD_DESC, 
           as.integer(ACADEMIC_PERIOD), .fun = I))
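
For comparison, here is a minimal sketch (not part of the original answer) of the two routes raised in the comments: fct_reorder() with its default summary function, and sorting the rows first and then calling fct_inorder(). The small `demo` data frame and the names `by_default` and `by_row_order` are hypothetical, made up for illustration; the sketch assumes each description maps to exactly one period code.

```r
library(dplyr)
library(forcats)

# Hypothetical four-row sample, deliberately out of order.
# factor() sorts the levels alphabetically, which is not the order we want.
demo <- data.frame(
  ACADEMIC_PERIOD = c("200840", "200810", "200830", "200820"),
  ACADEMIC_PERIOD_DESC = factor(
    c("Fall 2008", "J-Term 2008", "Summer 2008", "Spring 2008")
  )
)

# Route 1: fct_reorder() with the default .fun (median).
# With one code per level, the median of a single value is that value.
by_default <- demo %>%
  mutate(ACADEMIC_PERIOD_DESC = fct_reorder(ACADEMIC_PERIOD_DESC,
                                            as.integer(ACADEMIC_PERIOD)))

# Route 2: sort the rows by the code, then take levels in order of appearance.
by_row_order <- demo %>%
  arrange(as.integer(ACADEMIC_PERIOD)) %>%
  mutate(ACADEMIC_PERIOD_DESC = fct_inorder(ACADEMIC_PERIOD_DESC))

levels(by_default$ACADEMIC_PERIOD_DESC)
levels(by_row_order$ACADEMIC_PERIOD_DESC)
```

Both calls to levels() should list the terms chronologically. Route 2 avoids a summary function altogether, at the cost of reordering the rows; route 1 leaves the row order untouched.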
akrun