
I have a character object in R that holds a paragraph of about 200 words. I need to find every occurrence of number-space-character throughout the text and replace it with number-dash-character. For example, if the text says "the summer of 2019 was the hottest summer ever", I need to convert it to "the summer of-2019 was the hottest summer ever". What is an efficient way to do this? I actually have to do it for thousands of paragraphs.

text <- "the summer of 2019 was the hottest summer ever"
some_function(text)
[1] "the summer of-2019 was the hottest summer ever"

Fred
  • Please make this question *reproducible*. This includes sample code (including listing non-base R packages), sample *unambiguous* data (e.g., `dput(head(x))` or `data.frame(x=...,y=...)`), and expected output. Refs: https://stackoverflow.com/questions/5963269, https://stackoverflow.com/help/mcve, and https://stackoverflow.com/tags/r/info. – r2evans Sep 12 '19 at 20:57
  • `gsub("(\\D)\\s(\\d)", "\\1-\\2", text)` – r2evans Sep 12 '19 at 21:12
  • `gsub("(?<=\\D)\\s(?=\\d)", "-", text, perl=TRUE)` – r2evans Sep 12 '19 at 21:13
  • For clarity, though, your subject says *"number-space-character"* but your example is *"character-space-number"* ... just sayin' ... – r2evans Sep 13 '19 at 04:17

1 Answer


Your question is a little ambiguous, but based on your example, the following should work. It matches any whitespace character (\\s) that is preceded by an alphabetic character ([:alpha:]) and followed by a digit (\\d), and replaces it with a dash.

library(stringr)

test_string <- "the summer of 2019 was the hottest summer ever"

str_replace_all(test_string, "(?<=[:alpha:])\\s(?=\\d)", "-")
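
Since the question mentions thousands of paragraphs, here is a minimal base R sketch that may help. It relies on `gsub()` being vectorized over a character vector, so no explicit loop is needed. The `paragraphs` vector and the two result names are made up for illustration. Because the question's wording and its example point in opposite directions, both variants are shown.

```r
# Hypothetical sample data standing in for thousands of paragraphs.
paragraphs <- c(
  "the summer of 2019 was the hottest summer ever",
  "we sold 12 apples in 2020 and none since"
)

# Letter-space-digit becomes letter-dash-digit (matches the question's example).
# Lookbehind/lookahead leave the neighbouring characters untouched;
# perl = TRUE is required for lookarounds in base R.
letter_then_digit <- gsub("(?<=[[:alpha:]])\\s(?=\\d)", "-", paragraphs, perl = TRUE)

# Digit-space-letter becomes digit-dash-letter (matches the question's wording).
digit_then_letter <- gsub("(?<=\\d)\\s(?=[[:alpha:]])", "-", paragraphs, perl = TRUE)

print(letter_then_digit)
print(digit_then_letter)
```

Both calls process the whole vector in one pass, so the same line should work unchanged whether `paragraphs` has two elements or many thousands.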

Rogers