12

Colleagues,

I'm looking at a data frame resembling the extract below:

Month   Provider Items
January CofCom   25
july    CofCom   331
march   vobix    12
May     vobix    0

I would like to capitalise the first letter of each word and lowercase the remaining letters of each word. This would result in the data frame resembling the one below:

Month   Provider Items
January Cofcom   25
July    Cofcom   331
March   Vobix    12
May     Vobix    0

In a word, I'm looking for R's equivalent of the PROPER function available in MS Excel.

Konrad

4 Answers

31

With regular expressions:

x <- c('woRd Word', 'Word', 'word words')
gsub("(?<=\\b)([a-z])", "\\U\\1", tolower(x), perl=TRUE)
# [1] "Word Word"  "Word"       "Word Words"

(?<=\\b)([a-z]) says look for a lowercase letter preceded by a word boundary (e.g., a space or the beginning of a line). (?<=...) is called a "look-behind" assertion. \\U\\1 says replace that character with its uppercase version. \\1 is a back reference to the first group surrounded by () in the pattern. See ?regex for more details.

If you only want to capitalize the first letter of the first word, use the pattern "^([a-z])" instead.
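
As a rough sketch of how this could be applied to the data frame from the question: the data frame `df` below is a hypothetical reconstruction of the extract, and the helper name `to_proper` is only illustrative. The pattern itself is the one shown above.

```r
# Hypothetical reconstruction of the data frame from the question
df <- data.frame(
  Month    = c("January", "july", "march", "May"),
  Provider = c("CofCom", "CofCom", "vobix", "vobix"),
  Items    = c(25, 331, 12, 0),
  stringsAsFactors = FALSE
)

# Illustrative helper: lowercase everything, then uppercase the first
# letter after each word boundary
to_proper <- function(x) {
  gsub("(?<=\\b)([a-z])", "\\U\\1", tolower(x), perl = TRUE)
}

# Apply the helper to character columns only; numeric columns stay as they are
is_chr <- vapply(df, is.character, logical(1))
df[is_chr] <- lapply(df[is_chr], to_proper)
df
```

Note that a word boundary also occurs after an apostrophe, so a value such as "don't" would come out as "Don'T" with this pattern.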

Matthew Plourde
13

The question is about an equivalent of Excel PROPER and the (former) accepted answer is based on:

proper=function(x) paste0(toupper(substr(x, 1, 1)), tolower(substring(x, 2)))

It might be worth noting that:

proper("hello world")
## [1] "Hello world"

Excel PROPER would instead give "Hello World". For a 1:1 mapping with Excel, see @Matthew Plourde's answer.

If what you actually need is to set only the first character of a string to upper-case, you might also consider the shorter and slightly faster version:

proper=function(s) sub("(.)", "\\U\\1", tolower(s), perl=TRUE)
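
To make the difference concrete, here is a small side-by-side sketch. The names `proper_first` and `proper_words` are hypothetical, chosen only to keep the two variants apart; the patterns are the ones quoted in this thread.

```r
# Uppercase only the first character of the whole string
proper_first <- function(s) sub("(.)", "\\U\\1", tolower(s), perl = TRUE)

# Uppercase the first letter of every word (closer to Excel's PROPER)
proper_words <- function(s) {
  gsub("(?<=\\b)([a-z])", "\\U\\1", tolower(s), perl = TRUE)
}

proper_first("hello wORLD")
## [1] "Hello world"
proper_words("hello wORLD")
## [1] "Hello World"
```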
antonio
11

Another method uses the stringi package. The stri_trans_general function with id = "Title" appears to lowercase every letter other than the first letter of each word.

require(stringi)
x <- c('woRd Word', 'Word', 'word words')
stri_trans_general(x, id = "Title")
# [1] "Word Word"  "Word"       "Word Words"
lawyeR
  • 3
    For future visitors: stringi has a function called `stri_trans_totitle` which does the same thing. Not sure if that existed at the time of this answer. – IceCreamToucan May 22 '18 at 18:59
5

I don't think there is one, but you can easily write it yourself:

(dat <- data.frame(x = c('hello', 'frIENds'),
                   y = c('rawr','rulZ'),
                   z = c(16, 18)))
#         x    y  z
# 1   hello rawr 16
# 2 frIENds rulZ 18

proper <- function(x)
  paste0(toupper(substr(x, 1, 1)), tolower(substring(x, 2)))


(dat <- data.frame(lapply(dat, function(x)
  if (is.numeric(x)) x else proper(x)),
  stringsAsFactors = FALSE))

#         x    y  z
# 1   Hello Rawr 16
# 2 Friends Rulz 18

str(dat)
# 'data.frame':  2 obs. of  3 variables:
#   $ x: chr  "Hello" "Friends"
#   $ y: chr  "Rawr" "Rulz"
#   $ z: num  16 18
rawr
  • Thank you, this is what I was looking for. It's such a nice thing that should be part of the base :) – Konrad Jul 25 '14 at 14:21
  • Just one word of caution: the numeric column in my data frame was changed to a factor after I applied this function, which messed up the chart a little, so I had to make it numeric again. – Konrad Jul 25 '14 at 14:30
  • @Konrad in that case, I would use `data.frame(lapply(dat, function(x) if(is.numeric(x)) x else proper(x)))` or something similar – rawr Jul 25 '14 at 14:35
  • Thank you very much, it's a very useful solution. I'm wondering whether it would be sensible to move the `if(is.numeric` part into the function itself. – Konrad Jul 25 '14 at 15:43
  • you could do that, too. you could also expand the function to handle different classes in different ways – rawr Jul 25 '14 at 16:10
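
One possible way to fold the class check into the function, as discussed in the comments above. This is only a sketch: the name `proper_col` is hypothetical, and converting factors to character before changing case is an assumption about the desired behaviour, not something stated in the answer.

```r
# Hypothetical variant of proper() that decides what to do per column class
proper_col <- function(x) {
  if (is.factor(x)) x <- as.character(x)  # assumption: treat factors as text
  if (!is.character(x)) return(x)         # numeric, logical, etc. pass through
  paste0(toupper(substr(x, 1, 1)), tolower(substring(x, 2)))
}

dat <- data.frame(x = c("hello", "frIENds"),
                  y = c("rawr", "rulZ"),
                  z = c(16, 18),
                  stringsAsFactors = FALSE)

# Assigning into dat[] replaces the columns but keeps the data frame itself
dat[] <- lapply(dat, proper_col)
str(dat)
```

With the check inside the function, the call site no longer needs the `if (is.numeric(x))` branch, and further classes can be handled by adding cases to `proper_col`.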