Is there a "dplyr way" to rename a subset of variables in a data.frame based on a variable-name translator (vt), a data.frame containing columns with the old and new variable names (old_varname and new_varname, respectively)? For example:

d <- iris
vt <- data.frame(old_varname=c('Sepal.Length','Petal.Length'),
                  new_varname=c('a','b'))
d <- d %>% rename_( .... )
# In base R code, this would be:
names(d)[names(d) %in% vt$old_varname] <- vt$new_varname

Edit: Further clarification:

  • Assume the vector of variables to be translated is very long, so writing the old-new name pairs by hand is not viable
  • The variables to be renamed are only a subset of all the variables; I still want to keep every variable
LucasMation
  • I think you're better off with the base-R solution for this case. What doesn't work about it for you? – Shorpy Jun 02 '16 at 17:14
  • @Shorpy: the problem is that I have data.table loaded for the heavy data manipulation. data.table changes the syntax of the subsetting "[ ]" and breaks some code. So I was trying to keep all the metadata-related code in dplyr. – LucasMation Jun 02 '16 at 18:31
  • Possible dupe? [Rename multiple columns by names](https://stackoverflow.com/questions/20987295/rename-multiple-columns-by-names) – tjebo Jan 02 '20 at 16:46

4 Answers

Try this:

d <- iris
vt <- data.frame(old_varname=c('Sepal.Length','Petal.Length'),
                 new_varname=c('a','b'), stringsAsFactors = FALSE)
d.out <- d %>% rename_(.dots = setNames(vt$old_varname, vt$new_varname))

head(d.out)
    a Sepal.Width   b Petal.Width Species
1 5.1         3.5 1.4         0.2  setosa
2 4.9         3.0 1.4         0.2  setosa
3 4.7         3.2 1.3         0.2  setosa
4 4.6         3.1 1.5         0.2  setosa
5 5.0         3.6 1.4         0.2  setosa
6 5.4         3.9 1.7         0.4  setosa

Please note that the old names passed to setNames() must be a character vector, not a factor, so I create vt with stringsAsFactors = FALSE.

zyurnaidi
  • This looks great but I get an error: # Error: All arguments to rename must be named. – LucasMation Jun 02 '16 at 20:05
  • @LucasMation Hm, I'm not sure what's wrong. But I updated the answer to show the full process on my side – zyurnaidi Jun 02 '16 at 21:15
  • rename_ is deprecated by now. Jeff Hammerbacher's answer with rename() and !!set_names() does work at this moment in time, if you want to use tidyverse (dplyr and purrr in this case) – 4rj4n Dec 03 '19 at 11:17
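
Since rename_() is deprecated, the same lookup can be sketched with the current rename() verb. This is a minimal sketch, assuming a recent dplyr (1.0 or later) in which rename() accepts a named character vector wrapped in all_of(); the object names lookup and d_out are illustrative and not from the original answer.

```r
library(dplyr)

d <- iris
vt <- data.frame(old_varname = c("Sepal.Length", "Petal.Length"),
                 new_varname = c("a", "b"),
                 stringsAsFactors = FALSE)

# Named character vector: names are the new names, values are the old names
lookup <- setNames(vt$old_varname, vt$new_varname)

# all_of() raises an error if an old name is missing from d;
# any_of() could be used instead to skip missing names silently
d_out <- d %>% rename(all_of(lookup))

names(d_out)
```

Columns that do not appear in the lookup keep their original names and positions.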

Another way to solve this problem is to unquote a named vector with the new column names as the names and the old column names as the values. Note that I'm using purrr::set_names to make the named vector.

library(tidyverse)
d <- as_tibble(iris)
vt <- tibble(old_varname=c('Sepal.Length','Petal.Length'),
             new_varname=c('a','b'))
d_new_names <- d %>% rename(!!set_names(vt$old_varname, vt$new_varname))
head(d_new_names)
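
A function-based variant of the same idea, for comparison: rename_with() applies a renaming function to a selected set of columns. This is a sketch under the assumption that dplyr 1.0 or later is installed; the name d_fn is hypothetical, and match() is used here so that the order of rows in vt does not need to follow the column order of the data.

```r
library(dplyr)

d <- as_tibble(iris)
vt <- tibble(old_varname = c("Sepal.Length", "Petal.Length"),
             new_varname = c("a", "b"))

# .x holds the selected column names; look each one up in the translator.
# any_of() restricts the renaming to columns that are actually present.
d_fn <- d %>%
  rename_with(~ vt$new_varname[match(.x, vt$old_varname)],
              .cols = any_of(vt$old_varname))

names(d_fn)
```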
Jeff Hammerbacher

Lucas,

Thanks for the clarification:

You can use data.table::setnames(). Hope this helps.

if (!require(data.table)) install.packages("data.table")
data(iris)
d <- iris
head(d)
old_varname=c('Sepal.Length','Petal.Length')
new_varname=c('a','b')
d2 <- d %>% data.table::setnames(old = old_varname, new = new_varname)
head(d2)

output:

> if (!require(data.table)) install.packages("data.table")
> data(iris)
> d <- iris
> head(d)
  Sepal.Length Sepal.Width Petal.Length Petal.Width Species
1          5.1         3.5          1.4         0.2  setosa
2          4.9         3.0          1.4         0.2  setosa
3          4.7         3.2          1.3         0.2  setosa
4          4.6         3.1          1.5         0.2  setosa
5          5.0         3.6          1.4         0.2  setosa
6          5.4         3.9          1.7         0.4  setosa
> old_varname=c('Sepal.Length','Petal.Length')
> new_varname=c('a','b')
> d2 <- d %>% data.table::setnames(old = old_varname, new = new_varname)
> head(d2)
    a Sepal.Width   b Petal.Width Species
1 5.1         3.5 1.4         0.2  setosa
2 4.9         3.0 1.4         0.2  setosa
3 4.7         3.2 1.3         0.2  setosa
4 4.6         3.1 1.5         0.2  setosa
5 5.0         3.6 1.4         0.2  setosa
6 5.4         3.9 1.7         0.4  setosa
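
One point worth illustrating is that setnames() renames its input by reference rather than returning a modified copy. The sketch below assumes data.table 1.12.0 or later (for the skip_absent argument) and takes the names straight from a translator table like the vt in the question; working on a copy() keeps the original object untouched.

```r
library(data.table)

vt <- data.frame(old_varname = c("Sepal.Length", "Petal.Length"),
                 new_varname = c("a", "b"),
                 stringsAsFactors = FALSE)

# copy() so that the by-reference rename does not touch the original object
d2 <- copy(iris)

# skip_absent = TRUE ignores old names that are not present in d2
setnames(d2, old = vt$old_varname, new = vt$new_varname, skip_absent = TRUE)

names(d2)
```

Because old and new are matched by name, the order of the rows in vt does not have to follow the column order of the data.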
Technophobe01
  • Sorry, my post was unclear (I'll edit now). Assume the vector of variables to be translated is very long, so writing the old - new name pairs by hand (as in your answer) is not viable. Tks. – LucasMation Jun 02 '16 at 17:02
  • @LucasMation - In the updated question context my preference would be data.table::setnames(). See updated example. – Technophobe01 Jun 02 '16 at 19:32

Use setNames()

    > df <- letters[1:5]
    > iris %>% setNames(df)

If you know the indices of the columns you want to rename in the original data, you could do that using setnames() from the data.table package

   > df <- letters[24:26]
   > df
   [1] "x" "y" "z"
   > setnames(iris,names(iris)[c(1,2,5)],df)
   > head(iris)
       x   y Petal.Length Petal.Width      z
   1 5.1 3.5          1.4         0.2 setosa
   2 4.9 3.0          1.4         0.2 setosa
   3 4.7 3.2          1.3         0.2 setosa
   4 4.6 3.1          1.5         0.2 setosa
   5 5.0 3.6          1.4         0.2 setosa
   6 5.4 3.9          1.7         0.4 setosa
Sowmya S. Manian
  • Thanks. That seems to be the right direction. However, the variables contained in the translation dictionary are just a subset of the total variables, and I want to keep them all (see my edit above) – LucasMation Jun 02 '16 at 17:10
  • So you are saying that if you have 5 columns and want to rename any 3 of them, the remaining two should keep their original names. Is that correct? – Sowmya S. Manian Jun 02 '16 at 17:18
  • Yes! But you can't assume the order of the variables is the same in the dataset and the dictionary – LucasMation Jun 02 '16 at 18:27
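
The ordering concern raised in this thread can be handled in base R by matching on names instead of positions. This is a minimal sketch, not code from any of the answers: the dictionary is deliberately listed in a different order than the data, and "Not.A.Column" is a hypothetical entry added only to show that unmatched dictionary rows are ignored.

```r
d <- iris
vt <- data.frame(old_varname = c("Petal.Length", "Sepal.Length", "Not.A.Column"),
                 new_varname = c("b", "a", "z"),
                 stringsAsFactors = FALSE)

# Position of each data column in the dictionary (NA if it is not listed)
idx <- match(names(d), vt$old_varname)

# Replace only the names that were found; all other columns keep their names
found <- !is.na(idx)
names(d)[found] <- vt$new_varname[idx[found]]

names(d)
```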