In R, I want to loop over items to replace in a column.

The input to the function is a list of items, and I want to return that list after removing every occurrence of the items in the `itemsToBeRemoved` list.

removePunctuation <- function(punctuationObject){
    itemsToBeRemoved <- list(".", ",", ";", ":", "'", "!", "#", "-", "--")
    objectApplyTo <- punctuationObject
    for (itemToReplace in itemsToBeRemoved){
        resultObject <- gsub("itemToReplace", "", objectApplyTo, fixed=TRUE)
        return(resultObject)   
    }
}

I expect all instances of ".", ",", ";", ":", "'", "!", "#", "-", "--" to be removed from a list of character elements.

Chris
  • Check out package `stringr`, vectorized function `str_remove_all`. – Rui Barradas Jul 09 '19 at 19:15
  • Also, your code has 2 errors: 1) `gsub("itemToReplace", etc)` with quotes. They are not needed, like this you are removing the *string* `"itemToReplace"` not the *variable* `itemToReplace`. 2) It returns after the first time through the loop. – Rui Barradas Jul 09 '19 at 19:17
  • Thanks, that's my constant manipulating it to work. I removed the quotes and placed return at end of loop but still nothing was removed. – Chris Jul 09 '19 at 19:24
  • `removePunctuation <- function(punctuationObject){ itemsToBeRemoved <- list(".", ",", ";", ":", "'", "!", "#", "-", "--"); objectApplyTo <- punctuationObject; for (itemToReplace in itemsToBeRemoved){ resultObject <- gsub(itemToReplace, "", objectApplyTo, fixed=TRUE) }; return(resultObject) }` – Chris Jul 09 '19 at 19:24
  • Why not just `gsub("[[:punct:]]", "", string)`? – jay.sf Jul 09 '19 at 19:25
  • just used the [[:punct:]] method. That worked but I need flexibility to remove other items that might be specific strings – Chris Jul 09 '19 at 19:26
  • @Chris you can add them with `|`. Try `gsub("[[:punct:]]|fox", "", "The.Quick,brown;fox:jumps'over!the#lazy-dogs--")`. – jay.sf Jul 09 '19 at 19:28
  • That does work and I added other items to get more familiar against my own data set. I'm still a bit confused how I could remove just a period and a backslash. Would I just use "\\." and "\\\" to specifically remove a period or backslash? – Chris Jul 09 '19 at 19:52
  • @Chris Yes also separated by `|`, backslashes are nasty, though, `gsub("\\\\|\\.", "", "Do \\re mi.")`, see [this answer](https://stackoverflow.com/a/25427271/6574038). – jay.sf Jul 10 '19 at 09:34
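
The alternation approach from the comments above can be sketched as follows. This is only an illustration: the vector name `extra_words` and the word "dogs" are assumptions added for the example, not part of the thread.

```r
# Combine the POSIX punctuation class with specific strings using "|".
extra_words <- c("fox", "dogs")  # hypothetical extra strings to remove
pattern <- paste(c("[[:punct:]]", extra_words), collapse = "|")

cleaned <- gsub(pattern, "", "The.Quick,brown;fox:jumps'over!the#lazy-dogs--")
print(cleaned)
# [1] "TheQuickbrownjumpsoverthelazy"

# A literal backslash needs four backslashes in the pattern:
# two for the R string literal, doubled again for the regex engine.
no_backslash <- gsub("\\\\|\\.", "", "Do \\re mi.")
print(no_backslash)
# [1] "Do re mi"
```

Note that any extra string containing regex metacharacters would need escaping before being pasted into the pattern.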

2 Answers

A base R solution could be

removePunctuation <- function(punctuationObject){
  itemsToBeRemoved <- c(".", ",", ";", ":", "'", "!", "#", "-", "--")
  resultObject <- punctuationObject
  for (itemToReplace in itemsToBeRemoved){
    resultObject <- gsub(itemToReplace, "", resultObject, fixed = TRUE)
  }
  resultObject
}

x <- c("This, that; end.", "Second: single quote' etc !")

removePunctuation(x)
#[1] "This that end"            "Second single quote etc "
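
For comparison, the same idea can be written without an explicit loop. This is a minimal sketch, assuming the items are plain strings to be matched literally; the name `remove_fixed` is hypothetical. It uses base R's `Reduce` to feed the result of each `gsub` call into the next one, which is what the `for` loop above does.

```r
# Remove each literal string in `items` from every element of `x`.
remove_fixed <- function(x, items) {
  Reduce(
    function(acc, item) gsub(item, "", acc, fixed = TRUE),
    items,  # strings to remove, applied one after another
    x       # starting value: the untouched input
  )
}

items <- c(".", ",", ";", ":", "'", "!", "#", "-")
x <- c("This, that; end.", "Second: single quote' etc !")

result <- remove_fixed(x, items)
print(result)
# [1] "This that end"            "Second single quote etc "
```

The `"--"` entry is left out here because removing every `"-"` already removes every `"--"`.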
Rui Barradas

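You have several problems. One is that, to make this work on a list, every iteration has to update the same object; your version keeps applying `gsub` to the original input, so each pass overwrites the result of the previous one. Also note that "." is a regex wildcard: unless you pass `fixed = TRUE`, it matches any character rather than a plain dot, so it has to be escaped as "\\.". Check this: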

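removePunctuation <- function(punctuationObject){
  # Patterns are treated as regular expressions here, so the dot is escaped
  itemsToBeRemoved <- list("\\.", ",", ";", ":", "'", "!", "#", "-", "--")
  for (item in itemsToBeRemoved){
    # Reassign the same object so each removal builds on the previous one
    punctuationObject <- gsub(item, "", punctuationObject)
    print(punctuationObject)  # show the intermediate result after each pass
  }
  return(punctuationObject)
}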

punctuationObject <- list("a,", "b", "c#")


removePunctuation(punctuationObject)
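
To make the wildcard point concrete, here is a small sketch comparing the three ways of passing a dot to `gsub`. The input string and variable names are made up for the illustration.

```r
x <- "a.b,c"

# As a regex, "." matches every character, so everything is removed.
as_regex <- gsub(".", "", x)

# With fixed = TRUE the pattern is a literal string: only the dot goes.
as_literal <- gsub(".", "", x, fixed = TRUE)

# Escaping the dot gives the same result while staying in regex mode.
escaped <- gsub("\\.", "", x)

print(c(as_regex, as_literal, escaped))
# [1] ""     "ab,c" "ab,c"
```

If every item is a plain string, `fixed = TRUE` avoids the escaping entirely; regex mode is only needed when patterns such as `[[:punct:]]` are involved.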
user123