How to remove number or text elements from all columns

Question

The dataset used in this question is "Wage" from the ISLR package.

    library(ISLR)

    head(Wage)

   year age           maritl     race       education             region       jobclass         health
1 2006  18 1. Never Married 1. White    1. < HS Grad 2. Middle Atlantic  1. Industrial      1. <=Good
2 2004  24 1. Never Married 1. White 4. College Grad 2. Middle Atlantic 2. Information 2. >=Very Good
3 2003  45       2. Married 1. White 3. Some College 2. Middle Atlantic  1. Industrial      1. <=Good
  health_ins  logwage      wage
1      2. No 4.318063  75.04315
2      2. No 4.255273  70.47602
3     1. Yes 4.875061 130.98218

Columns 3 through 9 contain an unwanted prefix at the start of each value, such as 1. or 2.

How can I remove these unwanted characters and numbers from all of the mentioned columns?

Hi Tuyen, have a look here https://stackoverflow.com/help/how-to-ask and here: https://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example and revise your question. Also have a look at http://tidyverse.org/ for your direct problem — Jan, Aug 27 '17 at 10:18

CPak · Accepted Answer · 2017-08-27T10:45:56.477

Use mutate_at to strip the leading number prefix (a digit, a dot, and a space, e.g. "1. ") from columns 3 to 9:

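library(dplyr)
library(ISLR)  # provides the Wage data set
temp <- Wage
# Strip a leading "<digits>. " prefix from columns 3 to 9;
# the dot is escaped so it matches a literal "." rather than any character
ans <- temp %>% 
         mutate_at(3:9, funs(sub("^\\d+\\. ", "", .)))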

Output

head(ans)

  year age        maritl  race    education          region    jobclass      health
1 2006  18 Never Married White    < HS Grad Middle Atlantic  Industrial      <=Good
2 2004  24 Never Married White College Grad Middle Atlantic Information >=Very Good
3 2003  45       Married White Some College Middle Atlantic  Industrial      <=Good
4 2003  43       Married Asian College Grad Middle Atlantic Information >=Very Good
5 2005  50      Divorced White      HS Grad Middle Atlantic Information      <=Good
6 2008  54       Married White College Grad Middle Atlantic Information >=Very Good
  health_ins  logwage      wage
1         No 4.318063  75.04315
2         No 4.255273  70.47602
3        Yes 4.875061 130.98218
4        Yes 5.041393 154.68529
5        Yes 4.318063  75.04315
6        Yes 4.845098 127.11574
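
For newer dplyr releases, where funs() is deprecated and across() supersedes the mutate_at() family, the same idea might look like the sketch below. It is not part of the original answer: the toy data frame is a hypothetical stand-in built from a few values in the question's output, and the pattern assumes every prefix is "digits, a literal dot, optional whitespace".

```r
library(dplyr)

# Hypothetical toy data: a few values copied from the question's head(Wage) output
toy <- data.frame(
  year      = c(2006, 2004, 2003),
  maritl    = factor(c("1. Never Married", "1. Never Married", "2. Married")),
  education = factor(c("1. < HS Grad", "4. College Grad", "3. Some College"))
)

# ^\\d+\\.\\s*  anchors at the start of the string, then matches one or more
# digits, a literal dot, and any whitespace that follows
cleaned <- toy %>%
  mutate(across(maritl:education, function(x) sub("^\\d+\\.\\s*", "", x)))

print(cleaned)
```

Note that sub() returns character vectors, so the cleaned columns are no longer factors; wrap the call in factor() if factor columns are needed.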

edited Aug 27 '17 at 10:45

answered Aug 27 '17 at 10:22

CPak


Thanks @Chi Pak. But are there any other ways to remove 4., 3., or many others without writing too many "mutate_at(3:9, funs(sub("1. ", "", .)))" calls? – Tuyen Aug 27 '17 at 10:28
@Tuyen You could also do it like this: `temp %>% mutate_at(3:9, funs(sub("[12]. ", "", .)))` – Jaap Aug 27 '17 at 10:33
Hi @ChiPak, what does it mean when you put . after "" in temp %>% mutate_at(3:9, funs(sub("[12]. ", "", .)))? – Tuyen Aug 27 '17 at 10:44
[piping](https://cran.r-project.org/web/packages/magrittr/vignettes/magrittr.html) – CPak Aug 27 '17 at 10:48
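
On the question about the trailing `.`: inside funs() it is a placeholder for the column currently being transformed, so it ends up as the third argument (x) of sub(). A rough base R sketch of the same operation, with that column written out as an explicit function argument, may make this clearer. The toy data frame and the name `column` are hypothetical, chosen only for illustration.

```r
# Hypothetical toy data with the same kind of prefixes as Wage
toy <- data.frame(
  age        = c(18, 24),
  race       = c("1. White", "1. White"),
  health_ins = c("2. No", "1. Yes"),
  stringsAsFactors = FALSE
)

cols <- 2:3  # positions of the columns to clean

# `column` plays the role of `.` in funs(sub("[12]. ", "", .)):
# sub(pattern, replacement, x) is called once per selected column
toy[cols] <- lapply(toy[cols], function(column) sub("[12]. ", "", column))

print(toy)
```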
