I want to run the same five commands on a list of variables. My real data has many more, but for the purpose of this example, say I have two. Written out in full, what I want to do is this:

data$inst <- tolower(data$inst)
data$inst <- str_replace(data$inst, "è|ê", "e")
data$inst <- str_replace(data$inst, "à|â", "a")
data$inst <- str_replace(data$inst, "’|`", " ")
data$inst <- str_replace(data$inst, "  ", " ")

data$inst_city <- tolower(data$inst_city)
data$inst_city <- str_replace(data$inst_city, "è|ê", "e")
data$inst_city <- str_replace(data$inst_city, "à|â", "a")
data$inst_city <- str_replace(data$inst_city, "’|`", " ")
data$inst_city <- str_replace(data$inst_city, "  ", " ")

I am sure there must be a smart (read: loop) way of doing this, but the following does not manipulate the variables in the way I want:

list <- c('data$inst', 'data$inst_city')
for (var in list) {
  var <- tolower(var)
  var <- str_replace(var, "è|ê", "e")
  var <- str_replace(var, "à|â", "a")
  var <- str_replace(var, "’|`", " ")
  var <- str_replace(var, "  ", " ")
}

So how do I loop over the variables inst and inst_city in data? As a side note, I am fairly new to R and suspect this is a situation where the pipe operator would be handy, but I can't quite figure out how.

EDIT: I can see that my question has been closed because there are already answered questions on how to replace accented letters with non-accented letters. Thanks for providing the references. For the purposes of learning R, however, I am still very interested in how to loop over variables, or how to solve my problem using a loop, because I would like to understand how looping works in R compared to Stata.

EmilA
  • `R` doesn't usually update variables in place; some other languages really lean into that pattern. Your code looks good; the only thing is that you have to assign the newly modified string to a new object, maybe `list2.append(var)`, or use positions in the original, `list[[1]] = var`. Up to you. – Nate Mar 16 '23 at 12:47
  • "some other languages really lean into that pattern" my use of Stata is exactly why I am looking for this kind of solution, here I would simply write foreach x inst city{ data$`x' <- tolower(data$`x') } – EmilA Mar 16 '23 at 12:52
  • Not sure what your specific needs are, but as far as the special accented characters go, one could handle them all at once with `stringi::stri_trans_general("èêàâï", "Latin-ASCII")`, which would return `eeaai`. For the other characters, are those really the only ones you want to convert to spaces? – Merijn van Tilborg Mar 16 '23 at 12:57
  • Yeah, well, just something like `for (position in 1:length(list)) { list[[position]] = var }`. The "iterate over length and update by index" approach is a classic R pattern to get around the lack of in-place updates. – Nate Mar 16 '23 at 13:08
  • @MerijnvanTilborg "Not sure what your specific needs are, but as far as the special accented characters, one could do it in once with stringi::stri_trans_general("èêàâï", "Latin-ASCII") which would return eeaai" Thanks a lot! "For the other characters, are those really the only ones you want to convert to spaces?" So far yes, but ultimately, probably no. – EmilA Mar 16 '23 at 14:12
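
The "update by index" pattern from the comments can be sketched for data frame columns as below. This is only a sketch under a few assumptions: the toy `data` values are made up, the vector of column names is called `vars` (hypothetical, chosen to avoid masking the base function `list`), and base R `gsub()` stands in for `stringr::str_replace()` so the example runs without extra packages. Note that `gsub()` replaces every match, whereas `str_replace()` replaces only the first one. The key point is that the loop iterates over column names as plain strings and reads and writes each column with `data[[var]]`; looping over the string 'data$inst' only transforms that literal text.

```r
# Toy data frame standing in for the real `data` (values are made up).
data <- data.frame(
  inst      = c("Collège  de France", "Fête d’Hiver"),
  inst_city = c("Châlons", "L`Isle"),
  stringsAsFactors = FALSE
)

# Column names as plain strings, not 'data$inst'.
vars <- c("inst", "inst_city")

for (var in vars) {
  x <- tolower(data[[var]])   # read the column by name
  x <- gsub("è|ê", "e", x)    # replace all matches, not just the first
  x <- gsub("à|â", "a", x)
  x <- gsub("’|`", " ", x)
  x <- gsub(" +", " ", x)     # collapse runs of spaces into one
  data[[var]] <- x            # write the result back into the data frame
}

print(data)
```

The same body could be wrapped in a function and applied with `data[vars] <- lapply(data[vars], f)`, which is closer to Stata's `foreach` in spirit but avoids the explicit write-back step.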

0 Answers