I have this dataset:

datasample <- data.frame(id = c(1,2,3),
                         prod = c("__eggs", 
                                  "fresh <br> turkey", 
                                  "meat tosón"))

I want to clean the prod variable: in this case, strip the "__" prefix, remove the <br> tag, and convert the text "tosón" to "toson". I found a solution for the accented characters on SO (Convert accented characters into ascii character), so I tried this code:

library(dplyr)    # mutate() and %>%
library(stringr)  # str_replace()

data <- datasample %>% 
  mutate(prod = str_replace(prod, "__", ""),
         prod = str_replace(prod, "<br>", ""),
         prod = iconv(prod, to="ASCII//TRANSLIT"))

But this is the result:

data

  id              prod
1  1              eggs
2  2 fresh <br> turkey
3  3        meat tosón

Does anyone know what I am doing wrong?

Alexis
  • When I run the code, the value is changed in the output, so I cannot replicate the behavior you are seeing. Are you looking at the output of the `mutate()` call, or at the original `datasample` value? `mutate()` does not update the existing data.frame; it returns a new one. What OS are you using? What version of R are you running? – MrFlick Feb 15 '22 at 20:22
  • Hello @MrFlick, I changed the code to check whether the data.frame was being updated, but I get the same result. I use a Mac, and my R version is 4.1.0. – Alexis Feb 15 '22 at 20:28
  • @Alexis I'm on a Mac too, but I get the correct result. I'm unsure why this is happening. Voting to close because it's not reproducible. – Mark Aug 01 '23 at 05:54

0 Answers