21

I'm working with an imported data set that corresponds to the extract below:

set.seed(1)
dta <- data.frame("This is Column One" = runif(n = 10),
                     "Another amazing Column name" = runif(n = 10),
                     "!## This Columns is so special€€€" = runif(n = 10),
                    check.names = FALSE)

I'm cleaning this data with dplyr and would like to change the column names to syntactically valid ones and then, as a second step, remove the punctuation. What I have tried so far:

dta_clean <- dta %>% 
    rename(make.names(names(dta)))

generates an error:

> dta_clean <- dta %>% 
+     rename(make.names(names(dta)))
Error: All arguments to rename must be named.

Desired result

What I want to achieve can be done in base R:

names(dta) <- gsub("[[:punct:]]","",make.names(names(dta)))

which would return:

> names(dta)
[1] "ThisisColumnOne"          "AnotheramazingColumnname" "XThisColumnsissospecial"

I want to achieve the same effect, but using dplyr and %>%.

Konrad
  • Looks like some tweaking of [this](http://stackoverflow.com/questions/30382908/r-dplyr-rename-variables-using-string-functions) – akrun Dec 04 '15 at 16:07
  • @akrun Thanks very much, I will try to do something with `setNames(tolower(gsub("\\.","_",names(.))))` as suggested in the linked answer. – Konrad Dec 04 '15 at 16:09
  • Only problem is that some characters are not parsing well within the `rename`. – akrun Dec 04 '15 at 16:10
  • Yup: `Error in parse(text = x) : :1:9: unexpected symbol 1: Service Condiitions` – Konrad Dec 04 '15 at 16:11
  • After tweaking, [this](http://stackoverflow.com/a/30383251/1655567) will work. – Konrad Dec 04 '15 at 16:14
  • Yes, the standalone should work outside the `rename`. But, I understand the reason for making it right with the `dplyr` functions itself. – akrun Dec 04 '15 at 16:15
  • For some reason I've taken to doing this if I know I'm going to rename everything: `iris %T>% { colnames(.) <- paste0("iris_",names(.)) }`. Somewhat unclear to me why I prefer it to `rename`. – Akhil Nair Jul 13 '18 at 09:23

5 Answers

35

I know this is an old question, and I'm sure you found the solution by now, but I stumbled here searching for the same question, and ultimately found a few new ways to do this.

Dplyr

Using dplyr 0.6.0 and above, there is now a rename_all function:

  # `.` stands for the current column names, so this works for any piped data frame
  dta %>% 
    rename_all(funs(gsub("[[:punct:]]", "", make.names(.))))

This works, but it looks a little messy to me. If you want more flexibility with dplyr, you can also use:

  • rename_at
  • rename_if
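On dplyr 1.0.0 or later, the same idea can be sketched with rename_with(), which applies a function to the column names. This is only a sketch: clean_name() is a hypothetical helper, and the data frame is a reduced, ASCII-only version of the question's example.

```r
library(dplyr)  # rename_with() assumes dplyr >= 1.0.0

# Reduced, ASCII-only version of the question's data (an assumption for illustration)
set.seed(1)
dta <- data.frame("This is Column One" = runif(n = 10),
                  "!## Special Column" = runif(n = 10),
                  check.names = FALSE)

# Hypothetical helper: make names syntactically valid, then drop punctuation
clean_name <- function(x) gsub("[[:punct:]]", "", make.names(x))

# rename_with() passes the vector of column names to clean_name()
dta_cln <- dta %>% rename_with(clean_name)
names(dta_cln)
```

Because the helper only sees the names it is given, the pipeline no longer refers to dta by name inside the renaming step.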

Janitor

This is a pretty nice package (with plenty of additional utility) that can easily clean up column names:

library(janitor)

dta %>% 
  clean_names()

Which will rename and clean all column names to the following:

[1] "this_is_column_one"  "another_amazing_column_name"  "x_this_columns_is_so_special"

Everything becomes snake_case rather than CamelCase, but overall clean_names is very flexible in the column names it handles. If snake_case is a deal breaker, you can use yet another package, snakecase, and call its to_big_camel_case() function within rename_all, although that is starting to get a little esoteric.
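If CamelCase output is the goal, one option worth trying is the case argument of clean_names() itself. The sketch below assumes a janitor version that accepts case = "upper_camel"; the data frame is a made-up, ASCII-only example, and the exact names produced can differ between janitor versions.

```r
library(janitor)  # assumes a janitor version whose clean_names() accepts `case`

# Made-up example data with awkward column names
dta <- data.frame("This is Column One" = 1:3,
                  "Another amazing Column name" = 4:6,
                  check.names = FALSE)

# Ask for upper camel case instead of the default snake case
dta_camel <- clean_names(dta, case = "upper_camel")
names(dta_camel)
```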

Dave Gruenewald
  • `funs()` is deprecated as of dplyr 0.8.0. Looks like you now want: `dta %>% rename_all(list(~ gsub("[[:punct:]]", "", .)))` or, since `rename_all()` has been superseded by `rename_with()`, `dta %>% rename_with(~ gsub("[[:punct:]]", "", .x))` – Brian D Feb 17 '21 at 19:16
34

Set column names with the pipe like so:

iris %>% `colnames<-`(c("newcol1", "newcol2", "newcol3", "newcol4", "newcol5"))

Which returns

    newcol1 newcol2 newcol3 newcol4    newcol5
1       5.1     3.5     1.4     0.2     setosa
2       4.9     3.0     1.4     0.2     setosa
3       4.7     3.2     1.3     0.2     setosa
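The same trick can compute the new names from the old ones instead of listing them by hand. A minimal sketch, assuming the magrittr pipe and a reduced, ASCII-only version of the question's data:

```r
library(magrittr)  # provides %>% and the `.` placeholder

# Reduced, ASCII-only version of the question's data (an assumption for illustration)
dta <- data.frame("This is Column One" = 1:3,
                  "!## Special Column" = 4:6,
                  check.names = FALSE)

# `.` appears only inside a nested call, so dta is still passed as the first argument
dta_cln <- dta %>%
  `names<-`(gsub("[[:punct:]]", "", make.names(names(.))))

names(dta_cln)
```

The original dta keeps its names; only the piped result is renamed.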
stevec
6
mtcars %>% 
  data.table::setnames(
    old = mtcars %>% names(),
    new = mtcars %>% names() %>% paste0("_new_name")
  )

The setnames function from the data.table package renames the columns of a data frame. It needs two arguments here: old and new.

mtcars %>% names() returns the column names of mtcars in pipeline style; names(mtcars) is equivalent.

In this minimal example, I rename the columns within the pipeline by appending a suffix to every old column name with paste0. You can add a prefix, a suffix, or apply other rules.
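One caveat worth keeping in mind: setnames() renames by reference, so it changes the object it is given rather than returning a modified copy. The sketch below works on an explicit copy so that mtcars itself is left alone; cars is a hypothetical variable name.

```r
library(data.table)

# Work on a deep copy so the built-in mtcars is not modified by reference
cars <- data.table::copy(mtcars)

# Append a suffix to every column name, in place
data.table::setnames(cars,
                     old = names(cars),
                     new = paste0(names(cars), "_new_name"))

head(names(cars))
```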

Jiaxiang
  • Please add some explanation to your answer. Why is your answer better than the accepted answer for example? – Jesse Apr 30 '18 at 14:05
3

You can also try this

set.seed(1)
dta <- data.frame("This is Column One" = runif(n = 10),
                 "Another amazing Column name" = runif(n = 10),
                 "!## This Columns is so special€€€" = runif(n = 10),
                check.names = FALSE)

dta <- dta  %>% 
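  # Drop everything except letters, digits and spaces
  setNames(gsub("[^[:alnum:] ]", "", names(.), perl = TRUE)) %>% 
  # Capitalise the first letter of each word and lower-case the rest
  setNames(gsub("(\\w)(\\w*)", "\\U\\1\\L\\2", names(.), perl = TRUE))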

names(dta)
[1] "This Is Column One"          "Another Amazing Column Name" " This Columns Is So Special"
user3357059
  • This should be the accepted answer. The others depend on the dataframe being assigned in the first place in order to then alter colnames. Thanks for this! – Anurag N. Sharma Jun 29 '20 at 08:57
1

Using stringr and dplyr, with the dot placeholder:

dta %>%
   dplyr::rename_all(funs(
                     stringr::str_replace_all( ., "[[:punct:]]", "_" )
   ))
Tony Cronin