This is one of the few cases where I feel factor()
is really useful:
lvls <- c("Retained to Midyear Year 1", "Retained to Start of Year 2",
          "Retained to Midyear Year 2", "Completed Degree in 1 Year",
          "Completed Degree in 2 Years")
DT$retention_completion_variable_name <-
  factor(DT$retention_completion_variable_name, levels = lvls)
DT <- DT[order(DT$retention_completion_variable_name), ]
DT
  retention_completion_variable_name retention_completion_value
2         Retained to Midyear Year 1                          0
5        Retained to Start of Year 2                          1
4         Retained to Midyear Year 2                          1
1         Completed Degree in 1 Year                          0
3        Completed Degree in 2 Years                          0
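To see why the factor matters here, the following minimal sketch contrasts sorting plain character strings with sorting a factor whose levels are given explicitly. The vector `x` is a made-up example, not the data from the question.

```r
# Hypothetical example vector, not taken from the question's data
x <- c("Completed Degree in 1 Year",
       "Retained to Start of Year 2",
       "Retained to Midyear Year 1")

# Character vectors sort alphabetically, so "Completed" comes first
sort(x)

# A factor sorts by the position of each value in `levels`
f <- factor(x, levels = c("Retained to Midyear Year 1",
                          "Retained to Start of Year 2",
                          "Completed Degree in 1 Year"))
as.character(sort(f))

# The integer codes behind the factor reflect the level positions
as.integer(f)
```

Because order() works on those integer codes, the rows follow whatever sequence is written in `levels`.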
Data
# Built with data.frame() because the labels themselves contain spaces
DT <- data.frame(
  retention_completion_variable_name = c(
    "Completed Degree in 1 Year",
    "Retained to Midyear Year 1",
    "Completed Degree in 2 Years",
    "Retained to Midyear Year 2",
    "Retained to Start of Year 2"
  ),
  retention_completion_value = c(0, 0, 0, 1, 1)
)
Enhancement
If there are many years to cover, creating the factor levels by hand would be cumbersome and error-prone. However, this can be automated by observing three rules:
- All "Retained" come before all "Completed".
- Within "Retained" it's ordered by year, and within each year "Start" comes before "Midyear".
- Within "Completed" it's ordered by year.
These rules can be used to create the factor levels programmatically:
n_years <- 5L
years <- seq_len(n_years)
lvls <- c(paste(c("Retained to Start of Year", "Retained to Midyear Year"),
                rep(years, each = 2L)),
          # singular "Year" for 1, plural otherwise, to match the labels in the data
          sprintf("Completed Degree in %i Year%s", years,
                  ifelse(years == 1L, "", "s")))
lvls
 [1] "Retained to Start of Year 1" "Retained to Midyear Year 1" "Retained to Start of Year 2"
 [4] "Retained to Midyear Year 2" "Retained to Start of Year 3" "Retained to Midyear Year 3"
 [7] "Retained to Start of Year 4" "Retained to Midyear Year 4" "Retained to Start of Year 5"
[10] "Retained to Midyear Year 5" "Completed Degree in 1 Year" "Completed Degree in 2 Years"
[13] "Completed Degree in 3 Years" "Completed Degree in 4 Years" "Completed Degree in 5 Years"
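One caveat with generated levels: factor() silently turns any value that is not listed in `levels` into NA, so a small spelling difference between the generated labels and the data (such as "Year" versus "Years") leaves those rows as NA, and order() puts them at the end by default. A minimal sketch of a sanity check, using the hypothetical vectors `observed` and `generated` rather than the original data:

```r
# Hypothetical stand-ins for the data column and the generated levels
observed  <- c("Completed Degree in 1 Year", "Retained to Midyear Year 1")
generated <- c("Retained to Midyear Year 1", "Completed Degree in 1 Years")

# Values present in the data but absent from the levels
unmatched <- setdiff(observed, generated)
unmatched

# These values become NA once the factor is built
f <- factor(observed, levels = generated)
is.na(f)
```

Running a check like setdiff() before the conversion makes such mismatches visible instead of letting them turn into NA unnoticed.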