3

I'm working on an Rmd file that will be turned into an HTML report using knitr. I imported data from an xls file, used clean_names() for the column names, and finished the manipulations. This is a sample of the data:

df <- data.frame(precinct = c("a_b_c", "b_c_d", "e_f_g"), steve_alpha = c(309, 337, 294), mike_bravo = c(120, 151, 240), allan_charlie = c(379, 442, 597))

Now I want to present the data aesthetically in a table using kable() but the column names and the contents of the "precinct" column need to be in title case. Is there a function that will do this all at once?

Andrew C
  • `library(tools); colnames(df) <- toTitleCase(colnames(df))` is one way to change to title case, e.g. `toTitleCase("This is a test")` – thehand0 Jan 21 '22 at 15:47
  • `snakecase::to_title_case` does this. I wrote a [similar function](https://github.com/camille-s/camiller/blob/main/R/clean_titles.R) for a package at my job. It's basically a wrapper around a couple `gsub` calls, feel free to copy it – camille Jan 21 '22 at 15:48
  • Can you clarify whether you want to replace underscores with spaces? That's what title case implies to me, but I might be wrong – camille Jan 21 '22 at 15:58

5 Answers

4

There are a couple of wrinkles here. First, I'm interpreting title case as meaning that each word starts with a capital letter and underscores are replaced by spaces. Second, some solutions will work on the data frame's names but not on the precinct column, because tools::toTitleCase, which underlies some other functions including the snakecase::to_title_case that I initially suggested, assumes that a single letter shouldn't be capitalized.

snakecase::to_title_case(names(df))
#> [1] "Precinct"      "Steve Alpha"   "Mike Bravo"    "Allan Charlie"
snakecase::to_title_case(df$precinct, sep_out = " ", sep_in = "_")
#> [1] "A b c" "B c d" "E f g"

That doesn't look like the correct outcome for precincts. Knowing that every precinct has only single-letter words, you could just replace the underscores and then convert to all caps, but that won't hold for any other words. Alternatively, stringr::str_to_title doesn't keep single-letter words lowercase, so do the replacement first and then pass the result to that function.

stringr::str_to_title(stringr::str_replace_all(df$precinct, "_", " "))
#> [1] "A B C" "B C D" "E F G"
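The first alternative mentioned above (replace the underscores, then convert to all caps) can be sketched in base R. This is only a sketch that assumes every word in the precinct values is a single letter, so it is not a general title-case solution; the variable names are made up for illustration.

```r
# Sample precinct values from the question
precinct <- c("a_b_c", "b_c_d", "e_f_g")

# Replace every underscore with a space, then upper-case the whole string.
# Only appropriate because each "word" here is a single letter.
precinct_caps <- toupper(gsub("_", " ", precinct, fixed = TRUE))
precinct_caps
```

With longer words this would shout the whole string ("STEVE ALPHA"), which is why the str_to_title route is the safer default.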

I mentioned in a comment having made a similar function for a package at work, which handles a variety of cases and which people should feel free to copy. This is a greatly pared-down version that replaces underscores with spaces, then replaces any lowercase letter at the start of a word with its uppercase counterpart, so it will work in both instances.

clean_titles <- function(x) {
  x <- gsub("_", " ", x)
  x <- gsub("\\b([a-z])", "\\U\\1", x, perl = TRUE)
  x
}

clean_titles(names(df))
#> [1] "Precinct"      "Steve Alpha"   "Mike Bravo"    "Allan Charlie"
clean_titles(df$precinct)
#> [1] "A B C" "B C D" "E F G"

Finally, because you have a function that does this, you can use it in both dplyr::mutate to change that one column, and in dplyr::rename_with to change all column names.

library(dplyr)

df %>%
  mutate(precinct = clean_titles(precinct)) %>%
  rename_with(clean_titles)
#>   Precinct Steve Alpha Mike Bravo Allan Charlie
#> 1    A B C         309        120           379
#> 2    B C D         337        151           442
#> 3    E F G         294        240           597
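Since the question's end goal is a kable() table, here is a hedged sketch of the last step. It redefines the sample data and clean_titles so it runs on its own, uses base R assignment instead of dplyr, and only calls knitr::kable if knitr is installed; `report_df` is a name invented for this example.

```r
# Sample data and helper repeated so the sketch is self-contained
df <- data.frame(
  precinct = c("a_b_c", "b_c_d", "e_f_g"),
  steve_alpha = c(309, 337, 294),
  mike_bravo = c(120, 151, 240),
  allan_charlie = c(379, 442, 597)
)

clean_titles <- function(x) {
  x <- gsub("_", " ", x)
  gsub("\\b([a-z])", "\\U\\1", x, perl = TRUE)
}

# Work on a copy so the original snake_case names stay available
report_df <- df
report_df$precinct <- clean_titles(report_df$precinct)
names(report_df) <- clean_titles(names(report_df))

# Render the table only when knitr is available (assumption: it is
# installed wherever the Rmd is knitted)
if (requireNamespace("knitr", quietly = TRUE)) {
  knitr::kable(report_df)
}
```

Keeping the cleaned copy separate means the rest of the report can keep referring to the original column names.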
camille
3

Can't this be done with pure regex, just passing perl = TRUE to gsub and using the \U modifier?

gsub("(^|_)([[:alpha:]])", "\\1\\U\\2", names(df), perl = TRUE)
## [1] "Precinct"      "Steve_Alpha"   "Mike_Bravo"    "Allan_Charlie"

Hence:

to_title <- function(x) {
  gsub("_", " ", gsub("(^|_)([[:alpha:]])", "\\1\\U\\2", x, perl = TRUE))
}
df$precinct <- to_title(df$precinct)
names(df) <- to_title(names(df))
df
##   Precinct Steve Alpha Mike Bravo Allan Charlie
## 1    A B C         309        120           379
## 2    B C D         337        151           442
## 3    E F G         294        240           597
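For readers unfamiliar with the case modifiers used above, a small illustration may help. In gsub and sub with perl = TRUE, \U upper-cases the rest of the replacement, \L lower-cases it, and \E ends the conversion. The inputs and variable names below are made up for the example.

```r
# \\U, \\L and \\E are only recognised in the replacement when perl = TRUE
x <- "hELLO wORLD"

# Upper-case the first letter of each word and lower-case the remainder
fixed_case <- gsub("(\\w)(\\w*)", "\\U\\1\\L\\2", x, perl = TRUE)

# \\E stops the conversion, so only the first captured group is upper-cased.
# Note that \\w also matches "_", so group 2 is the rest of the string.
first_only <- sub("^(\\w)(\\w*)", "\\U\\1\\E\\2", "steve_alpha", perl = TRUE)

fixed_case
first_only
```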
Mikael Jagan
3

There are a few ways you can do this.

There's a function in the tools package called toTitleCase. With this and gsub (rather than sub, so that every underscore is replaced), you can rename the columns like this:

names(df) <- tools::toTitleCase(gsub("_", " ", names(df)))

df
#>   Precinct Steve Alpha Mike Bravo Allan Charlie
#> 1    a_b_c         309        120           379
#> 2    b_c_d         337        151           442
#> 3    e_f_g         294        240           597

An equivalent way using the function str_to_title from the excellent stringr package:

names(df) <- stringr::str_to_title(gsub("_", " ", names(df)))
df
#>   Precinct Steve Alpha Mike Bravo Allan Charlie
#> 1    a_b_c         309        120           379
#> 2    b_c_d         337        151           442
#> 3    e_f_g         294        240           597

Finally, if you are keen to use pipes and dplyr:

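library(dplyr)

# rename_with supersedes rename_all and applies a function to the names only
df <- df |>
  rename_with(~ gsub("_", " ", .x)) |>
  rename_with(stringr::str_to_title)
df
#>   Precinct Steve Alpha Mike Bravo Allan Charlie
#> 1    a_b_c         309        120           379
#> 2    b_c_d         337        151           442
#> 3    e_f_g         294        240           597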
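One detail worth spelling out when replacing underscores: sub replaces only the first match in each string, while gsub replaces all of them. That makes no difference for the column names here, which have one underscore each, but it matters for values like the precincts. A quick sketch with made-up variable names:

```r
x <- c("steve_alpha", "a_b_c")

# sub() stops after the first underscore in each element
first_match <- sub("_", " ", x)

# gsub() replaces every underscore
all_matches <- gsub("_", " ", x)

first_match  # "steve alpha" "a b_c"
all_matches  # "steve alpha" "a b c"
```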
Owen J
1

Maybe this could be another solution:

library(stringr)

names(df) <- sapply(strsplit(names(df), "_"), \(x) {
  paste0(str_to_title(x), collapse = "_")
})

df
  Precinct Steve_Alpha Mike_Bravo Allan_Charlie
1    a_b_c         309        120           379
2    b_c_d         337        151           442
3    e_f_g         294        240           597
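The same split-and-rejoin idea can be sketched without stringr, and with a space as the output separator so it also covers the precinct column. `title_words` and its `sep_out` argument are hypothetical names chosen for this example, and the helper assumes plain ASCII words separated by underscores.

```r
# Split on "_", upper-case the first character of each piece, rejoin
title_words <- function(x, sep_out = " ") {
  pieces <- strsplit(x, "_", fixed = TRUE)
  vapply(pieces, function(w) {
    paste0(toupper(substring(w, 1, 1)), substring(w, 2), collapse = sep_out)
  }, character(1))
}

title_words(c("precinct", "steve_alpha", "a_b_c"))
title_words("steve_alpha", sep_out = "_")
```

Unlike str_to_title, this leaves the rest of each word untouched rather than lower-casing it.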
Anoushiravan R
0

Option 1: gregexpr/chartr

Here's a trick using regular expressions (solely looking for a preceding _ or the beginning of the string), then using chartr to translate the characters from lower to upper case.

nms <- colnames(df)
gre <- gregexpr("(?<=_|^)[a-z]", nms, perl = TRUE)
ltrs <- regmatches(nms, gre)
regmatches(nms, gre) <- 
  lapply(ltrs, chartr, old = paste(letters, collapse = ""), new = paste(LETTERS, collapse = ""))
colnames(df) <- nms
df
#   Precinct Steve_Alpha Mike_Bravo Allan_Charlie
# 1    a_b_c         309        120           379
# 2    b_c_d         337        151           442
# 3    e_f_g         294        240           597

Formalized a little, not really stress-tested:

#' @param text character
#' @param sep character, separators that will cause the next letter to
#'   be translated to upper-case
#' @return text, updated
toSnakeCase <- function(text, sep = c("^", "_")) {
  ptn <- paste0("(?<=", paste(sep, collapse = "|"), ")[a-z]")
  gre <- gregexpr(ptn, text, perl = TRUE)
  ltrs <- regmatches(text, gre)
  regmatches(text, gre) <- 
    lapply(ltrs, chartr, old = paste(letters, collapse = ""), new = paste(LETTERS, collapse = ""))
  text  
}

toSnakeCase(colnames(df), sep = "_")
# [1] "precinct"      "steve_Alpha"   "mike_Bravo"    "allan_Charlie"
toSnakeCase(colnames(df))
# [1] "Precinct"      "Steve_Alpha"   "Mike_Bravo"    "Allan_Charlie"
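As a possible simplification, the chartr call could be swapped for toupper, which should give the same result for the lowercase ASCII letters the pattern matches. This is a sketch of that variant, not the function above; `capitalize_after` is a hypothetical name, and like the original it assumes the separators are valid inside a PCRE lookbehind.

```r
capitalize_after <- function(text, sep = c("^", "_")) {
  # Match a lowercase letter that follows one of the separators
  ptn <- paste0("(?<=", paste(sep, collapse = "|"), ")[a-z]")
  gre <- gregexpr(ptn, text, perl = TRUE)
  # Replace each matched letter with its upper-case form
  regmatches(text, gre) <- lapply(regmatches(text, gre), toupper)
  text
}

res_default <- capitalize_after(c("precinct", "steve_alpha"))
res_underscore <- capitalize_after(c("precinct", "steve_alpha"), sep = "_")
res_default
res_underscore
```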

Option 2: toTitleCase, modified

(Perhaps I should have led with this.)

colnames(df) <- gsub(" ", "_", tools::toTitleCase(gsub("_", " ", colnames(df))))
df
#   Precinct Steve_Alpha Mike_Bravo Allan_Charlie
# 1    a_b_c         309        120           379
# 2    b_c_d         337        151           442
# 3    e_f_g         294        240           597
r2evans