199

Let's say that I have a two-word string and I want to capitalize both words.

name <- c("zip code", "state", "final count")

The Hmisc package has a function capitalize which capitalizes the first word, but I'm not sure how to get the second word capitalized. The help page for capitalize doesn't suggest that it can perform that task.

library(Hmisc)
capitalize(name)
# [1] "Zip code"    "State"       "Final count"

I want to get:

c("Zip Code", "State", "Final Count")

What about three-word strings:

name2 <- c("I like pizza")
MichaelChirico
ATMathew

15 Answers

202

There is a built-in base-R solution for title case as well:

tools::toTitleCase("demonstrating the title case")
## [1] "Demonstrating the Title Case"

or

library(tools)
toTitleCase("demonstrating the title case")
## [1] "Demonstrating the Title Case"
petermeissner
  • 4
    Having looked into the source a little: the function aims for title case (which is not the same as starting every word with a capital letter) by capitalizing the first letter of every word except a collection of likely English exceptions (e.g. `c("all", "above", "after", "along", "also", "among", "any", "both", "can", "few", "it", "less", "log", "many", "may", "more", "over", "some", "their", "then", "this", "under", "until", "using", "von", "when", "where", "which", "will", "without", "yet", "you", "your")`) – petermeissner Jul 26 '16 at 11:45
  • 26
    You might be surprised if you expect ONLY the initial character to be capitalized. `tools::toTitleCase("HELLO")` results in `HELLO`. You might want to wrap this around `tolower` first, as so: `tools::toTitleCase(tolower("HELLO"))` which returns `Hello` – ddunn801 Apr 05 '17 at 15:56
  • 2
    Good point. Still, it's the most title-case-ish result you can get so far – petermeissner Apr 05 '17 at 20:12
  • Thanks! This solution works great for most cases except when there are abbreviations of U.S. states – Tung Feb 13 '18 at 08:24
  • This can work alright when needed to be called a small number of times, but using stringr::str_to_title sped my code up by a factor of about 15 versus tools::toTitleCase. – Max Candocia Sep 01 '20 at 05:22
  • 1
    `stringr::str_to_title("demonstrating the title case")` --> `# "Demonstrating The Title Case"`, which is not proper title case but an upper-case first character for each word. But sure, if that does not matter and speed does, you can use a stringr/stringi solution like the one already present: https://stackoverflow.com/a/22401159/1144966 – petermeissner Sep 01 '20 at 12:57
184

The base R function to perform capitalization is toupper(x). From the help file for ?toupper there is this function that does what you need:

simpleCap <- function(x) {
  s <- strsplit(x, " ")[[1]]
  paste(toupper(substring(s, 1,1)), substring(s, 2),
      sep="", collapse=" ")
}

name <- c("zip code", "state", "final count")

sapply(name, simpleCap)

     zip code         state   final count 
   "Zip Code"       "State" "Final Count" 

Edit: This works for any string, regardless of word count:

simpleCap("I like pizza a lot")
[1] "I Like Pizza A Lot"
Andrie
  • 12
    And if this is helpful to others: by putting the tolower function inside the simpleCap function you can deal with all-caps words too:
    name <- c("george wasHINgton", "tom jefferson", "ABE LINCOLN")
    simpleCap <- function(x) {
      s <- tolower(x)
      s <- strsplit(s, " ")[[1]]
      paste(toupper(substring(s, 1, 1)), substring(s, 2), sep = "", collapse = " ")
    }
    sapply(name, simpleCap)
    – MatthewR Sep 03 '14 at 18:22
  • How about hyphenated names? Like Smith-Jones or Al-Rayon, which could be entered as SMITH-JONES or al-rayon. – Hack-R Jan 05 '15 at 14:44
  • 1
    You can use `paste0()` instead of `paste(..., sep="")`. Simply shorter. – MERose Aug 06 '15 at 23:37
  • 3
    @merose Correct, but not in this case, since `paste0 ()` doesn't accept the `collapse = ...` argument – Andrie Aug 07 '15 at 05:36
  • 3
    @Andrie is that still correct? `paste0(c("a", "b"), collapse = ",")` works fine for me. Perhaps this is a recent feature? – MichaelChirico Oct 06 '16 at 03:13
  • @MichaelChirico Good to know. – Andrie Oct 11 '16 at 20:37
  • How can we apply this to a data frame with a text variable? I tried `df %>% mutate(V1 = simpleCap(V1))` but it fills the whole variable with the first entry. – HNSKD Jun 03 '17 at 10:22
  • Why don't they add this to base R? I would use it more than toupper. – MadmanLee Dec 27 '17 at 23:21
  • does not deal well with NAs, when sapplying to a vector – Ferroao Jan 27 '18 at 13:10
  • I can't imagine you typed this over from the help page, since the code spacing on the help page IS readable, as opposed to yours. Come on, `sep=""`, really? Use some spaces, man! – MS Berends Jul 09 '19 at 07:04
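
Two of the comments above (a whole column being filled with the first entry, and trouble with NAs) seem to share one cause: `strsplit(x, " ")[[1]]` only looks at the first element of `x`. A minimal sketch of a vectorised variant follows. `simple_cap_vec` is a hypothetical name, not something from the help page, and returning NA for NA input is an assumption about what is wanted.

```r
# Hypothetical helper (not from ?toupper): applies the simpleCap idea
# to every element of a character vector and passes NA through.
simple_cap_vec <- function(x) {
  vapply(x, function(s) {
    if (is.na(s)) return(NA_character_)          # assumption: keep NA as NA
    words <- strsplit(s, " ", fixed = TRUE)[[1]] # split one string into words
    paste0(toupper(substring(words, 1, 1)), substring(words, 2),
           collapse = " ")
  }, character(1), USE.NAMES = FALSE)
}

# Made-up data frame with a text column, including a missing value
df <- data.frame(V1 = c("zip code", NA, "final count"),
                 stringsAsFactors = FALSE)
df$V1 <- simple_cap_vec(df$V1)
df$V1
# [1] "Zip Code"    NA            "Final Count"
```

Because the helper returns one value per input element, it should also be usable inside `mutate()`, e.g. `mutate(V1 = simple_cap_vec(V1))`.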
105

Match a regular expression that starts at the beginning ^ or after a space [[:space:]] and is followed by an alphabetical character [[:alpha:]]. Globally (the g in gsub) replace all such occurrences with the matched beginning or space and the upper-case version of the matched alphabetical character, \\1\\U\\2. This has to be done with perl-style regular expression matching.

gsub("(^|[[:space:]])([[:alpha:]])", "\\1\\U\\2", name, perl=TRUE)
# [1] "Zip Code"    "State"       "Final Count"

In a little more detail for the replacement argument to gsub(), \\1 says 'use the part of x matching the first sub-expression', i.e., the part of x matching (^|[[:space:]]). Likewise, \\2 says use the part of x matching the second sub-expression ([[:alpha:]]). The \\U is syntax enabled by using perl=TRUE, and means to convert what follows in the replacement to Upper-case. So for "zip code" there are two matches: in the first, \\1 is the empty string at the start and \\2 is "z", so \\1\\U\\2 is "Z"; in the second, \\1 is " " and \\2 is "c", so \\1\\U\\2 is " C". The result is "Zip Code".

The ?regexp page is helpful for understanding regular expressions, ?gsub for putting things together.

MichaelChirico
Martin Morgan
  • 12
    bah! I originally went down this path, but mistakenly was using `\\u` and gave up before realizing I should have capitalized it...somewhat ironic. Here's what I came up with, not thoroughly vetted against oddball cases: `gsub(pattern = "\\b([a-z])", replacement = "\\U\\1", name, perl = TRUE)` – Chase Jun 15 '11 at 23:09
  • I tried to use this on row names and it worked once but I couldn't repeat it. – dpel Aug 23 '16 at 07:11
  • 1
    Works on `tolower(name)` if there are other caps – MichaelChirico Oct 06 '16 at 03:18
  • @MichaelChirico has an important point here: "HELLO" and "hELLO" will both return "HELLO", while "hello" returns "Hello". Instead, `name` should preferably be updated to `tolower(name)`, in which "HELLO", "hELLO", and "hello" will all return "Hello". – Christian Feb 05 '21 at 07:32
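
As a rough illustration of the points in the comments above, the sketch below first shows which pieces the pattern actually matches and then applies the replacement to `tolower(name)`. The input vector is made up for the example.

```r
# Made-up input with mixed capitalization
name <- c("zip code", "STATE", "fINAL count")
pattern <- "(^|[[:space:]])([[:alpha:]])"

# Each match is a single letter, optionally preceded by one whitespace character
pieces <- regmatches("zip code", gregexpr(pattern, "zip code"))[[1]]
pieces
# [1] "z"  " c"

# Lowercase everything first, then upper-case the first letter of each word
result <- gsub(pattern, "\\1\\U\\2", tolower(name), perl = TRUE)
result
# [1] "Zip Code"    "State"       "Final Count"
```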
88

Use this function from the stringi package:

library(stringi)
stri_trans_totitle(c("zip code", "state", "final count"))
## [1] "Zip Code"    "State"       "Final Count"

stri_trans_totitle("i like pizza very much")
## [1] "I Like Pizza Very Much"
gagolews
bartektartanus
  • 25
    The stringr package (if the tidyverse is your thing) wraps `stri_trans_totitle` into a function named `str_to_title()`. It's just stringi::stri_trans_totitle() under the covers, but might save loading another library (that you may, in essence, already have loaded), depending on your workflow. – crazybilly Sep 21 '16 at 13:52
  • 2
    The `str_to_title()` function does unexpected things. E.g. it turns the name McEwan into Mcewan. – markus Dec 17 '20 at 14:01
58

Alternative:

library(stringr)
a = c("capitalise this", "and this")
a
[1] "capitalise this" "and this"       
str_to_title(a)
[1] "Capitalise This" "And This"   
Brijesh
21

Try:

require(Hmisc)
sapply(name, function(x) {
  paste(sapply(strsplit(x, ' '), capitalize), collapse=' ')
})
diliop
16

From the help page for ?toupper:

.simpleCap <- function(x) {
    s <- strsplit(x, " ")[[1]]
    paste(toupper(substring(s, 1,1)), substring(s, 2),
          sep="", collapse=" ")
}


> sapply(name, .simpleCap)

zip code         state   final count 
"Zip Code"       "State" "Final Count"
Chase
9

The package BBmisc now contains the function capitalizeStrings.

library("BBmisc")
capitalizeStrings(c("the taIl", "wags The dOg", "That Looks fuNny!")
    , all.words = TRUE, lower.back = TRUE)
[1] "The Tail"          "Wags The Dog"      "That Looks Funny!"
Dirk
6

Alternative way with substring and regexpr:

substring(name, 1) <- toupper(substring(name, 1, 1))
pos <- regexpr(" ", name, perl=TRUE) + 1
substring(name, pos) <- toupper(substring(name, pos, pos))
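
The approach above only reaches the second word, because `regexpr` reports just the first space. One possible way to extend the same idea to any number of words is `gregexpr` together with `regmatches<-`. `cap_words` below is a hypothetical helper name, and this is only a sketch for plain space-separated words without missing values.

```r
# Hypothetical helper: upper-case every letter that starts a word.
cap_words <- function(x) {
  # Locate each word-initial letter (with its preceding whitespace, if any)
  m <- gregexpr("(^|[[:space:]])[[:alpha:]]", x)
  # Replace every matched piece by its upper-case version
  regmatches(x, m) <- lapply(regmatches(x, m), toupper)
  x
}

cap_words(c("zip code", "state", "i like pizza"))
# [1] "Zip Code"     "State"        "I Like Pizza"
```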
greg L
5

You could also use the snakecase package:

install.packages("snakecase")
library(snakecase)

name <- c("zip code", "state", "final count")
to_title_case(name)
#> [1] "Zip Code"    "State"       "Final Count"

# or 
to_upper_camel_case(name, sep_out = " ")
#> [1] "Zip Code"    "State"       "Final Count"

https://github.com/Tazinho/snakecase

Taz
2

This capitalizes all major words:

library(lettercase)
xString = str_title_case(xString)
Jonny
Cole Davis
2

Another version using StrCap in DescTools

Text = c("This is my phrase in r", "No, this is not my phrase in r")

DescTools::StrCap(Text) # Default only first character capitalized
[1] "This is my phrase in r"         "No, this is not my phrase in r"

DescTools::StrCap(Text, method = "word") # Capitalize each word
[1] "This Is My Phrase In R"        "No This Is Not My Phrase In R"

DescTools::StrCap(Text, method = "title") # Capitalize as in titles
[1] "This Is My Phrase in R"         "No, This Is Not My Phrase in R"
Chriss Paul
2

✓ one line
✓ one existing function; no new package
✓ works on list/all words
✓ capitalizes the first letter AND lowers the rest of the word :

name <- c("zip CODE", "statE", "final couNt")
gsub("([\\w])([\\w]+)", "\\U\\1\\L\\2", name, perl = TRUE)
[1] "Zip Code"    "State"       "Final Count"

If you plan on using it a lot, I guess you could make a wrapper function with it :

capFirst <- function(x) gsub("([\\w])([\\w]+)", "\\U\\1\\L\\2", x, perl = TRUE)
capFirst(name)

If you have special letters you can use this regex instead:

capFirst <- function(x) gsub("(\\p{L})(\\p{L}+)", "\\U\\1\\L\\2", x, perl = TRUE)
capFirst(name)

Except that perl = TRUE doesn't know how to change those letters to upper- or lowercase afterwards... So there's always:

stringi::stri_trans_totitle(c("zip CODE", "éTAts", "final couNt"))
#[1] "Zip Code"    "États"       "Final Count"
Salix
  • 1
    this works with base ASCII characters, but fails on accented European letters: `"Réchy" -> "RéChy"` – Dima Lituiev Dec 17 '22 at 10:42
  • 1
    calisse >< forgot those were special characters. You can use a Unicode range to cover those in the regex, though, like: "([\\w|\u00C0-\u01FF])([\\w|\u00C0-\u01FF]+)". – Salix Dec 22 '22 at 19:57
1

Here is a slight improvement on the accepted answer that avoids having to use sapply(). It also forces all characters after the first in each word to lowercase.

titleCase <- Vectorize(function(x) {
  
  # Split input vector value
  s <- strsplit(x, " ")[[1]]
  
  # Perform title casing and recombine
  ret <- paste(toupper(substring(s, 1,1)), tolower(substring(s, 2)),
        sep="", collapse=" ")
  
  return(ret)
  
}, USE.NAMES = FALSE)


name <- c("zip CODE", "statE", "final couNt")

titleCase(name)

#> "Zip Code"       "State" "Final Count" 

David J. Bosak
1

This might be of use to some. If the word is in all caps, one first has to make it lowercase.

tools::toTitleCase("FRANCE")
[1] "FRANCE"

as opposed to

tools::toTitleCase(tolower("FRANCE"))
[1] "France"
m_c