I have the following data frame:

id<-c("id1","id2","id3","id4","id5","id6")
out<-c("50","60","60 4d", "60.4","5823",NA)
cov<-c("Male","male","mále","Fe male","female","fema")
dat<-data.frame(id,out,cov)

I created two functions to help me organize and clean my data frame:

conv_number<-function(data,variable){
  data<- data |> dplyr::mutate(variable = gsub(pattern = ",", replacement = ".", variable))
  x<- data |> dplyr::mutate(variable = as.numeric(gsub("[^0-9.-]", "", variable)))
  return (x)
}

clean_string<-function(data,variable){
  data |> dplyr::mutate(variable = tolower(variable))
  x<- data |> dplyr::mutate(variable = gsub("[^a-z]", "", variable))
  return (x)
}

My intention is that each function takes a column of a dataset and transforms that same column in place. I use them like this:

prueba_1<-conv_number(dat,out)
prueba_1<-clean_string(dat,cov)

However, this is not what they do: they create a new column called "variable". And, of course, in the second example the function does not convert the characters to lowercase.

What am I missing here? Perhaps there is something wrong with how I use dplyr::mutate()?

Francisco
  • Does this answer your question? [Use dynamic name for new column/variable in `dplyr`](https://stackoverflow.com/questions/26003574/use-dynamic-name-for-new-column-variable-in-dplyr) – SamR Mar 23 '23 at 11:35
  • Thanks for the help! I see the point there. However, since I pass the name of the column as an argument, I'm not able to change that same column in the data frame. I'm not sure I'm explaining myself correctly: I want to modify the very column that I pass in the argument variable. – Francisco Mar 23 '23 at 11:41

1 Answer

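On the left-hand side of `=`, mutate() takes `variable` literally as the name of a new column, which is why you end up with a column called "variable". To target the column the caller passed in, embrace the argument with curly-curly braces (`{{ }}`) and use `:=` instead of `=`. Note also that the tolower() step in clean_string() has to be assigned back to `data`, otherwise its result is discarded. You can read more about quasiquotation to better understand how this works, but here's one option using curly-curly: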

library(dplyr)

id<-c("id1","id2","id3","id4","id5","id6")
out<-c("50","60","60 4d", "60.4","5823",NA)
cov<-c("Male","male","mále","Fe male","female","fema")
dat<-data.frame(id,out,cov)

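conv_number<-function(data,variable){
  # Replace decimal commas with decimal points in the column passed by the caller
  data<- data |> 
    dplyr::mutate({{variable}} := gsub(pattern = ",", replacement = ".", {{variable}}))
  
  # Drop everything except digits, "." and "-", then convert to numeric
  x <- data |> 
    dplyr::mutate({{variable}} := as.numeric(gsub("[^0-9.-]", "", {{variable}})))
  
  return (x)
}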

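clean_string<-function(data,variable){
  
  # Lowercase the column; the result must be assigned back to `data`
  data <- data |> 
    dplyr::mutate({{variable}} := tolower({{variable}}))
  
  # Keep only a-z; this also drops accented letters such as "á"
  x <- data |> 
    dplyr::mutate({{variable}} := gsub("[^a-z]", "", {{variable}}))
  
  return (x)
}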


conv_number(dat,out)
#>    id    out     cov
#> 1 id1   50.0    Male
#> 2 id2   60.0    male
#> 3 id3  604.0    mále
#> 4 id4   60.4 Fe male
#> 5 id5 5823.0  female
#> 6 id6     NA    fema


clean_string(dat,cov)
#>    id   out    cov
#> 1 id1    50   male
#> 2 id2    60   male
#> 3 id3 60 4d    mle
#> 4 id4  60.4 female
#> 5 id5  5823 female
#> 6 id6  <NA>   fema
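
As a variation on conv_number() above, here is a minimal sketch that does the same cleaning with dplyr::across(), which avoids `:=` altogether because across() writes the result back to the selected column. It assumes dplyr >= 1.0.0 and R >= 4.1 (for `|>` and `\(x)`); the name `conv_number_across` is hypothetical and only used for illustration.

```r
library(dplyr)

# Hypothetical helper: same steps as conv_number(), written with across().
# across() overwrites the selected column, so no `:=` is needed.
conv_number_across <- function(data, variable) {
  data |>
    mutate(across({{ variable }}, \(x) {
      x <- gsub(",", ".", x, fixed = TRUE)   # decimal comma -> decimal point
      as.numeric(gsub("[^0-9.-]", "", x))    # keep digits, "." and "-" only
    }))
}

# Small illustrative input (column names mirror the question's data frame)
dat <- data.frame(
  id  = c("id1", "id2", "id3"),
  out = c("50", "60,4", NA)
)

res <- conv_number_across(dat, out)
str(res)
```

Because `{{ variable }}` is handed to across() as a tidy-select expression, the same function should also accept several columns at once, for example `conv_number_across(dat, c(out, other_col))`, where `other_col` stands for any further column you want cleaned the same way.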
Matt