I have repetitive dplyr/stringr code that cleans data:
df1_final$sumaryczna_kwota_zobowiązań <-
df1_final$sumaryczna_kwota_zobowiązań %>%
str_replace(",", ".") %>% str_replace_all("\\s", "") %>% as.numeric()
df3_final$sumaryczna_liczba_kontraktu_dla_produktu <-
df3_final$sumaryczna_liczba_kontraktu_dla_produktu %>%
str_replace(",", ".") %>% str_replace_all("\\s", "") %>% as.numeric()
df3_final$sumaryczna_kwota_kontraktu_dla_produktu <-
df3_final$sumaryczna_kwota_kontraktu_dla_produktu %>%
str_replace(",", ".") %>% str_replace_all("\\s", "") %>% as.numeric()
df3_final$średnia_cena_produktu <-
df3_final$średnia_cena_produktu %>%
str_replace(",", ".") %>% str_replace_all("\\s", "") %>% as.numeric()
That is one column in one data frame and three columns in another, but the process is the same each time.
How can I turn this into a function that takes one or, better, several columns of a data frame and cleans them without repeating the code?
TO MODERATOR, EXPLANATION: my question is unique in that it asks how to apply several piped operations to several columns. The answers in the comments deserve promoting. From them I learned this syntax:
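library(dplyr); library(stringr); library(magrittr)  # magrittr supplies %<>%
# a pipeline that starts with a dot defines a reusable function
myfun = . %>% str_replace(",", ".") %>% str_replace_all("\\s", "") %>% as.numeric()
# and then use it on the columns named "a" and "b"
df %<>% mutate_at(c("a", "b"), .funs = myfun)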
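
For anyone on dplyr 1.0.0 or later, where mutate_at() is superseded by across(), the same idea can probably be written as below. This is only a sketch: the helper name clean_number and the sample data are made up for illustration, and it assumes the columns are character vectors that use a decimal comma and spaces as thousands separators, as in my data.

```r
library(dplyr)
library(stringr)

# Hypothetical helper: the same three cleaning steps as in the question
clean_number <- function(x) {
  x %>%
    str_replace(",", ".") %>%      # decimal comma -> decimal point
    str_replace_all("\\s", "") %>% # drop whitespace used as thousands separator
    as.numeric()
}

# Made-up example data, not taken from the real data frames
df <- tibble(
  a     = c("1 234,5", "10,25"),
  b     = c("7", "2 000,0"),
  label = c("x", "y")
)

# Clean only the selected columns; the others are left untouched
df_clean <- df %>% mutate(across(c(a, b), clean_number))
```

The same call should work for a single column, e.g. across(a, clean_number), and since the helper is an ordinary function it can be reused on both data frames.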