I am trying to use the replace() function in a dplyr pipeline to clean my data. I want to run it on all columns except one. If I use a select() statement first, I lose my character identifier column. I am looking for something like this:

newdata<-data %>% replace(((.)>1000),0)

But with one column excluded:

newdata<-data %>% replace(((-StoreID)>1000),0)
Ronak Shah
user2502836

2 Answers

Since you didn't provide a reproducible example, here's how it would work on the iris dataset:

iris %>% mutate_each(funs(replace(., . > 5, NA)), -Species)

We use mutate_each() to replace values greater than 5 with NA in all columns except Species.


For your example it would be something like:

data %>% mutate_each(funs(replace(., . > 1000, 0)), -StoreID)
Steven Beaupré
  • Out of curiosity, how could something similar be used with replace_na instead of replace? – Courvoisier Jun 02 '17 at 15:31
  • What if I want to run this on only one column? Will `data %>% mutate_each(funs(replace(., . > 1000, 0)), StoreID)` work? – kRazzy R Oct 25 '17 at 15:01
  • Is it possible to use `mutate_at` with a modified .vars argument? Something like this: `iris %>% mutate_at(.tbl=., .vars=names(.), funs(replace(., . > 5, NA)))`, where `Species` is subtracted from `names(.)`. Can anybody think of a way? – Agile Bean Apr 18 '19 at 02:18

mutate_each() was deprecated as of dplyr version 0.7.0. Here's an updated answer using across() (available since dplyr 1.0.0):

iris %>% mutate(across(-Species, ~replace(., . > 5, NA)))
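Applied to the question's case, the same pattern might look like the sketch below. The data frame and its `sales` and `returns` columns are made up for illustration, since the question gave no sample data; only `StoreID` and the 1000 threshold come from the original post.

```r
library(dplyr)

# Hypothetical toy data: StoreID is the character identifier to leave untouched.
data <- data.frame(
  StoreID = c("A1", "B2", "C3"),
  sales   = c(500, 1500, 900),
  returns = c(1200, 30, 1001)
)

# Replace values above 1000 with 0 in every column except StoreID.
newdata <- data %>%
  mutate(across(-StoreID, ~ replace(.x, .x > 1000, 0)))

newdata
```

If there are several non-numeric columns, selecting with `across(where(is.numeric), ...)` instead of `-StoreID` may be more convenient, assuming every numeric column should be cleaned.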
NovaEthos