I am a little confused about the mutate functions in dplyr. I have a binary data frame and I want to create new columns with mutate. Here is what I am trying to do:

author_dist <-
  dist_df %>%
  mutate_each(funs(gsub("T.*","",.)),-distance) %>% 
  mutate_each(funs(ifelse(nchar(.)<3,gsub("A","A0",.),.)),-distance) %>% 
  group_by(Sentence1,Sentence2) %>% 
  summarise(avg_dist=mean(distance)) %>% 

But when I run the code, I get a warning:

mutate_each() is deprecated. Use mutate_all(), mutate_at() or mutate_if() instead. To map funs over a selection of variables, use mutate_at()
mutate_each() is deprecated. Use mutate_all(), mutate_at() or mutate_if() instead. To map funs over a selection of variables, use mutate_at()

I tried mutate_at and mutate_all but no luck.

Case1:

author_dist <-
  dist_df %>%
  mutate_at(vars(gsub("T.*","",.)),-distance) %>% 
  mutate_at(vars(ifelse(nchar(.)<3,gsub("A","A0",.),.)),-distance) %>% 
  group_by(Sentence1,Sentence2) %>% 
  summarise(avg_dist=mean(distance)) %>% 

Error: Strings must match column names. Unknown columns: c("A1, c(1, 1, 0.75, 1, 0.875, 1, 0, 0.8, 0.6, 1, 1, 0.875, 0.8, 0.857142857142857, 1, 1, 0.75, 1, 0.75, 1, 1, 0.8, 0.666666666666667, 1, 1, 0.888888888888889, 0.9, 1, 0.8, 1, 1, 1, 1, 0.666666666666667, 1, 0.75, 0.666666666666667, 0.666666666666667, 1, 0.75, 1, 0.666666666666667, 1, 1, 1, 1, 0.833333333333333, 1, 1, 0.75, 0.8, 1, 1, 1, 1, 0.5, 1, 0.666666666666667, 1, 1, 1, 0.8, 0.8, 1, 0.8, 1, 0.8, 1, 0.8, 1, 1, 0.8, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0.666666666666667, 1, 1, 1, 1, 1, 1, 0.833333333333333, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0.75, 0.75, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0.666666666666667, 1, 1, 1, 1, 0.666666666666667, 1, 0.666666666666667, 1, 0.875, 0.75, 1, 1, 1, 1, 1, 0.857142857142857, 1, 0.333333333333333, 1, 1, 0.666666666666667, 1, 0.833333333333333, 1, 0.888888888888889, 1, 1, 0.75, 1, 1, 0.875, 1, 1, 1, 1, 0.8,

Case2:

author_dist <-
  dist_df %>%
  mutate_all(vars(gsub("T.*","",.)),-distance) %>% 
  mutate_all(vars(ifelse(nchar(.)<3,gsub("A","A0",.),.)),-distance) %>% 
  group_by(Sentence1,Sentence2) %>% 
  summarise(avg_dist=mean(distance)) %>% 

Error in eval_bare(dot$expr, dot$env) : object 'distance' not found

Sorry if this is a basic question; I am fairly new to statistics. Any ideas would be appreciated.

  • Read the results of `?mutate_each` and `?mutate_at`. `mutate_each` was deprecated, meaning its use is no longer recommended, in favour of its `_at` and `_all` replacements. `_all` applies the function to all variables; `_at` lets you specify which variables to apply the function to, but the syntax is not quite the same. When you say "I tried mutate_at and mutate_all but no luck", show us what you did and what went wrong. – Calum You Feb 28 '18 at 19:23
  • When asking for help, you should include a simple [reproducible example](https://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example) with sample input and desired output that can be used to test and verify possible solutions. Show the code you tried "with no luck". – MrFlick Feb 28 '18 at 19:47
  • Why do you end your code with the pipe operator? Is there something else coming after the last `%>%`? – Samuel Feb 28 '18 at 20:56
  • Yes, the visualisation code comes afterwards. I only copied the problematic part and forgot to delete it. –  Feb 28 '18 at 21:00
  • Look at the examples for `mutate_at` and `mutate_all` given in the documentation. `mutate_at(vars(gsub("T.*","",.)),-distance)` does not make sense. `vars()` takes column names or selection helpers such as `matches` (if you want to select variables with regular expressions); you can't just throw `gsub` in. `-distance` is not a function. This explains both your errors. – Calum You Mar 01 '18 at 13:25
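
Following the last comment, here is a minimal sketch of how the original pipeline might be written with `mutate_at`: `vars(-distance)` does the column selection, and the function to apply goes in the second argument. The real `dist_df` is not shown in the question, so the sample data below (including the `"A1T1"`-style values) is hypothetical and only assumes the column names `Sentence1`, `Sentence2` and `distance` used in the code above. Plain anonymous functions are used instead of `funs()`, which is an assumption about what works most widely across dplyr versions.

```r
library(dplyr)

# Hypothetical sample data; the real dist_df is not shown in the question.
dist_df <- data.frame(
  Sentence1 = c("A1T1", "A1T2", "A12T1", "A12T3"),
  Sentence2 = c("A2T1", "A2T4", "A3T2", "A3T5"),
  distance  = c(1, 0.5, 0.75, 0.25),
  stringsAsFactors = FALSE
)

author_dist <-
  dist_df %>%
  # vars() selects the columns (everything except distance);
  # the second argument is the function applied to each selected column.
  mutate_at(vars(-distance), function(x) gsub("T.*", "", x)) %>%
  # Pad short IDs such as "A1" to "A01"; longer IDs are left unchanged.
  mutate_at(vars(-distance),
            function(x) ifelse(nchar(x) < 3, gsub("A", "A0", x), x)) %>%
  group_by(Sentence1, Sentence2) %>%
  summarise(avg_dist = mean(distance)) %>%
  ungroup()

print(author_dist)
```

With these sample rows, the result should have one row per author pair, with `avg_dist` equal to the mean of that pair's distances. Note that the pipeline ends with `ungroup()` rather than a dangling `%>%`.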

0 Answers