Is it possible to apply a function in the replacement argument of gsub? Let's say that after str_to_title we have

This Is One Hell Of A Blahblah Cake

I would like to exclude certain words from the effect of the str_to_title function, so that I would have

This is one Hell of a blahblah Cake

I am aware that str_to_title has its own list of exceptions, but I would like to customize that list by reverting some words back to lowercase.

My approach at the moment is

gsub("( Is | One | BlahBlah )", tolower("\\1"), str_to_title(x))

but gsub will not see the tolower function. Any idea how to achieve this? How can we replace a regex match with the result of a function acting on the matched string?
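A short sketch of why the attempt has no effect: R evaluates tolower("\\1") before gsub is called, and lower-casing the literal backreference text leaves it unchanged, so gsub only ever receives "\\1". The variable names below are illustrative, and x is assumed to be the already title-cased string.

```r
# The replacement argument is evaluated first; "\\1" has no letters to lower-case.
replacement <- tolower("\\1")
identical(replacement, "\\1")
# [1] TRUE

# So each match is replaced by itself and the string comes back unchanged.
x <- "This Is One Hell Of A Blahblah Cake"
result <- gsub("( Is | One | BlahBlah )", replacement, x)
identical(result, x)
# [1] TRUE
```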

Kenny
  • Are you sure they were lowercase in the first place? You should not follow this approach. You may just use the `tools::toTitleCase` code and modify it by adding your exceptions. – Wiktor Stribiżew Nov 02 '17 at 14:45
  • Perhaps you're looking for the package **gsubfn**. – joran Nov 02 '17 at 14:50

1 Answer

You can prefix the replacement with \\L to convert the matched text to lower case (this requires perl = TRUE):

s = "This Is One Hell Of A Blahblah Cake"

gsub("(\\bIs\\b|\\bOne\\b|\\bBlahblah\\b)", "\\L\\1", s, perl = TRUE)
# [1] "This is one Hell Of A blahblah Cake"

Or, as @joran commented, you can use the gsubfn package:

library(gsubfn)
options(gsubfn.engine = "R")
gsubfn("\\bIs\\b|\\bOne\\b|\\bBlahblah\\b", ~ tolower(x), s)
# [1] "This is one Hell Of A blahblah Cake"
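If the goal is a customizable exception list without an extra package, the same idea can be sketched in base R with gregexpr and regmatches<-, which let an arbitrary function act on each match. This is only a minimal sketch: lower_words and its exceptions argument are hypothetical names chosen for illustration, and the words are assumed to contain no regex metacharacters.

```r
# Hypothetical helper: lower-case whole-word matches of `exceptions` in `x`.
lower_words <- function(x, exceptions) {
  # Build a pattern such as \b(is|one|of)\b from the word list.
  pattern <- paste0("\\b(", paste(exceptions, collapse = "|"), ")\\b")
  # Locate every match, ignoring case so "Is" and "is" are both found.
  m <- gregexpr(pattern, x, perl = TRUE, ignore.case = TRUE)
  # Apply tolower to the matched substrings and write them back in place.
  regmatches(x, m) <- lapply(regmatches(x, m), tolower)
  x
}

s <- "This Is One Hell Of A Blahblah Cake"
lowered <- lower_words(s, c("is", "one", "of", "a", "blahblah"))
lowered
# [1] "This is one Hell of a blahblah Cake"
```

Any function that returns one string per match could be used in place of tolower here.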
Psidom
  • `gsub("\\b(Is|One|Blahblah)\\b", "\\L\\1", s, perl = TRUE)` will also do. I'd add `ignore.case = TRUE` to cover more cases. And `perl = TRUE` must be there for it to work. Can you shed some light on this `perl = TRUE`? – Kenny Nov 02 '17 at 15:28
  • Agreed, `\\b(Is|One|Blahblah)\\b` is more concise. And `perl=TRUE` lets you use the *PCRE regular expressions library*, as described in the [documentation](https://www.regular-expressions.info/rlanguage.html), which is usually more powerful than the base R regex engine. In this case, we need `perl=TRUE` to use the `\\L` modifier. – Psidom Nov 02 '17 at 15:36
  • @Kenny The `\L` operator is not part of PCRE; it is an extension built into the PCRE library for R. It is implemented in the Boost regex library, which is based on PCRE. – Wiktor Stribiżew Nov 03 '17 at 00:02
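
To illustrate the case-conversion escapes discussed in these comments: with perl = TRUE, the replacement string in gsub may contain \\U or \\L to convert the rest of the replacement to upper or lower case, and \\E to end the conversion. A brief sketch (the input strings and variable names are made up for illustration):

```r
# \\U upper-cases what follows in the replacement; \\E switches conversion off.
shouted <- gsub("(\\w+) (\\w+)", "\\U\\1\\E \\2", "hello world", perl = TRUE)
shouted
# [1] "HELLO world"

# Capitalise each word: upper-case the first letter, lower-case the rest.
recased <- gsub("(\\w)(\\w*)", "\\U\\1\\L\\2", "a tEST of capitalizing", perl = TRUE)
recased
# [1] "A Test Of Capitalizing"
```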