I have a huge data frame in which some columns are of type character. The problem is that some values contain invalid ("wrong") characters, which causes errors like this:
mutate_all(data, funs(tolower))
> Error in mutate_impl(.data, dots) : Evaluation error: invalid input
> 'https://www.ps.f/c-w/nos-promions/v-ambght-rembment.html#modalit<e9>s'
> in 'utf8towcs'.
So I deleted the "wrong" characters (note: I can't simply remove all special characters, because I need the ":" to separate the data).
I found a solution:
library(qdap)
keep <- c(":")
data$column <- strip(data$column, keep, lower = TRUE)
See: How to remove specific special characters in R
That worked, but it is really slow. So my question is: how can I apply a function to all character columns of my data frame that is quicker than what I just did?
EDIT
An example of what happens in my script:
View(data$column)
"CP:main:234e5qhaw/00:lcd-monitor-with-smatimge-lite"
"CP:main:234e5qhaw/00:lcd-monitor-with-smarimge-lite"
"CP:main:234e5qhaw/00:lcd-monitor-with-sartimge-lite"
"CP:main:bri953/00:faq:skça_sorulan_sorular:xc000003329:f02:9044:9512"
tolower(data$column)
Error in tolower(data$column) :
invalid input "CP:main:bri953/00:faq:skça_sorulan_sorular:xc000003329:f02:9044:9512" in 'utf8towcs'
Ideally, I want to keep as much of the original data as possible. I understand that "special" characters may have to be replaced, but I really need to keep the ":" to separate the data at a later stage.
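To make the goal concrete, here is a minimal sketch of the kind of column-wise operation I am after, using only base R. It assumes that the offending bytes are Latin-1 encoded (the <e9> in the error message hints at this, but it is a guess), and the sample data frame and the names is_chr and column are made up for illustration. If the columns already contain valid UTF-8 mixed with Latin-1, this conversion would garble the valid part, so it is not a general fix.

```r
# Hypothetical sample data: the second value contains a raw Latin-1 byte
# (0xE7, "ç"), which is not valid UTF-8 on its own.
data <- data.frame(
  column = c("CP:main:234e5qhaw/00:lcd-monitor",
             paste0("CP:main:faq:sk", rawToChar(as.raw(0xe7)), "a_sorulan")),
  n = 1:2,
  stringsAsFactors = FALSE
)

# Find the character columns once, then process only those.
is_chr <- vapply(data, is.character, logical(1))

data[is_chr] <- lapply(data[is_chr], function(x) {
  # Assumption: the input bytes are Latin-1; re-encode them as UTF-8
  # so that tolower() no longer sees invalid input.
  x <- iconv(x, from = "latin1", to = "UTF-8")
  tolower(x)
})
```

Both iconv() and tolower() are vectorised, so each column is handled in one call rather than value by value, and the ":" separators are left untouched because nothing is stripped.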