Let's say I have a dataset with very unwieldy variable names, and I want to replace part of each name and add a running number. The code below works well: it replaces "nameverybig" with "var".

    library(tidyverse)
    ds <- data.frame(identification = 1:10,
                     nameverybig_do_you_like_cookies = c(1:10), 
                     nameverybig_have_you_been_in_europe = c(1:10),
                     nameverybig_whats_your_gender = c(1:10))


    ds <- ds %>% 
      rename_all(.,~sub("nameverybig_*", 
                        paste("var"),
                        names(ds)))

But I'm struggling with the process of renaming the string and adding a logical sequence.

ds %>% names
dados <- ds %>% 
  rename_all(.,~sub("nameverybig_*", 
                    paste("var", 1:3),
                    names(ds)))

I would like to stay within the tidyverse framework. I've tried rename_all with contains and matches, and also rename_at, but with no success. I based this code on other posts, such as this one and this one. This post includes reproducible code. Please let me know if I need to improve the question. Thank you.

Luis

3 Answers

Update

From dplyr 1.0.0 you can use rename_with.

You can select columns to rename by position

library(dplyr)
ds %>% rename_with(~paste0("var", seq_along(.), sub("nameverybig_*", "_", .)), -1)

Or by name

ds %>% rename_with(~paste0("var", seq_along(.), sub("nameverybig_*", "_", .)), 
                   starts_with('nameverybig'))

Both of which return:

#   identification var1_do_you_like_cookies var2_have_you_been_in_europe var3_whats_your_gender
#1               1                        1                            1                      1
#2               2                        2                            2                      2
#3               3                        3                            3                      3
#4               4                        4                            4                      4
#5               5                        5                            5                      5
#6               6                        6                            6                      6
#7               7                        7                            7                      7
#8               8                        8                            8                      8
#9               9                        9                            9                      9
#10             10                       10                           10                     10
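The numbering starts at 1 here because `rename_with` hands the function only the selected column names, so `seq_along(.)` counts within that selection. A minimal base R sketch of the same naming logic, applied to a plain character vector (`add_var_index` is a hypothetical helper name, not part of dplyr):

```r
# Hypothetical helper: same logic as the lambda above, as a named function
# that takes a character vector of column names.
add_var_index <- function(nms) {
  paste0("var", seq_along(nms), sub("nameverybig_*", "_", nms))
}

# Only the selected names are passed in, so the counter starts at 1.
selected <- c("nameverybig_do_you_like_cookies",
              "nameverybig_have_you_been_in_europe",
              "nameverybig_whats_your_gender")

new_names <- add_var_index(selected)
print(new_names)
```

A named function like this should be usable in place of the lambda, for example `ds %>% rename_with(add_var_index, starts_with("nameverybig"))`.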

Old Answer

You could use paste0 with sub. Note that rename_all applies the function to every column, so identification is renamed as well.

ds %>% rename_all(~paste0("var", seq_along(.), sub("nameverybig_*", "_", .)))

To rename only specific variables, we can use rename_at

ds %>% rename_at(vars(starts_with("nameverybig")), 
      ~paste0("var", seq_along(.), sub("nameverybig_*", "_", .)))
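To see why the two variants differ, it may help to run the naming function on plain character vectors. This is only a base R sketch of what happens to the names, not how dplyr is implemented; `rename_fn`, `all_names` and `target` are illustrative names.

```r
all_names <- c("identification",
               "nameverybig_do_you_like_cookies",
               "nameverybig_have_you_been_in_europe",
               "nameverybig_whats_your_gender")

# Same logic as the lambda passed to rename_all / rename_at.
rename_fn <- function(nms) {
  paste0("var", seq_along(nms), sub("nameverybig_*", "_", nms))
}

# Applied to every name: "identification" also gets a prefix and takes
# index 1, so the target columns are numbered from 2.
renamed_all <- rename_fn(all_names)

# Applied to the matching names only: other columns stay untouched and
# the numbering starts at 1.
target <- startsWith(all_names, "nameverybig")
renamed_subset <- all_names
renamed_subset[target] <- rename_fn(all_names[target])

print(renamed_all)
print(renamed_subset)
```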
Ronak Shah
  • Curious, how are you able to paste this result here? Like put this table in the answer – NelsonGon Aug 22 '19 at 13:25
  • @NelsonGon Sorry, I didn't get you. Do you mean the names or data of the table? – Ronak Shah Aug 22 '19 at 13:28
  • I just mean the table, copying it from the IDE to SO. I tried but some data goes to the next line. Happens often. – NelsonGon Aug 22 '19 at 13:29
  • Hey, @RonakShah, your script was nearly perfect, but it renamed all the variables, adding a 'var' prefix to each. Sorry, my bad! I've updated my original code to make my question clearer. Could you please point out how I can change only the variables with "nameverybig" while preserving all other variables in the dataset? – Luis Aug 22 '19 at 13:37
  • @NelsonGon I guess you need to adjust the width of console in IDE. – Ronak Shah Aug 22 '19 at 13:40
  • @Luis We can use `rename_at` for that case. Updated the answer. – Ronak Shah Aug 22 '19 at 13:40
  • @RonakShah Thanks, fixed although I prefer smaller widths but this will help me now. Sorry for spamming. I'll delete all comments later. – NelsonGon Aug 22 '19 at 13:43
  • Amazing. Thank you! – Luis Aug 22 '19 at 13:57
I find this a bit more concise, and it uses stringr, the tidyverse's string and regex package.

library(dplyr)
library(stringr)

ds %>%
  rename_all( ~ str_replace(., "nameverybig", paste0("var", seq_along(.))))

If the "nameverybig" variables are only a subset of the columns, I would combine this with Ronak Shah's answer, like so.

  ds %>%
    rename_at(vars(starts_with("nameverybig")), 
              ~ str_replace(., "nameverybig", paste0("var", seq_along(.))))
  • Nice code, thank you. Actually, because the dataset has variables before the "target" variable to rename, the replacement starts with "var2", "var3", etc. I see it happens because of seq_along. Is it possible to change that? – Luis Aug 22 '19 at 13:45
  • Yeah, I would just basically combine this with the other answer and use `rename_at` with `str_replace`. –  Aug 22 '19 at 13:48
An option with setNames:

    ds %>% 
      setNames(nm = paste0("var", 1:ncol(.),
                           gsub("nameverybig+", "", names(.))))

Or as suggested by @Adam one can use purrr/rlang's set_names:

ds %>%
  purrr::set_names(~paste0("var",seq_along(.),
                           gsub("nameverybig+",
                                "",.)))

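Result (for a ds that contains only the three nameverybig columns, as in the original version of the question):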

 var1_do_you_like_cookies var2_have_you_been_in_europe   var3_whats_your_gender
1                         1                            1                      1
2                         2                            2                      2
3                         3                            3                      3
4                         4                            4                      4
5                         5                            5                      5
6                         6                            6                      6
7                         7                            7                      7
8                         8                            8                      8
9                         9                            9                      9
10                       10                           10                     10  
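A note on the pattern used here: in `"nameverybig+"` the `+` applies only to the final `g`, so the underscore after the prefix is not consumed and survives in the new name. The short base R sketch below compares that pattern with the `"nameverybig_*"` pattern from the question on a single name (the variable names are illustrative):

```r
x <- "nameverybig_do_you_like_cookies"

# "+" quantifies only the preceding "g"; the underscore is left in place.
keep_underscore <- gsub("nameverybig+", "", x)

# "_*" also consumes the underscore, which the replacement "_" puts back.
swap_underscore <- sub("nameverybig_*", "_", x)

# With an empty replacement the underscore is dropped entirely.
drop_underscore <- sub("nameverybig_*", "", x)

print(c(keep_underscore, swap_underscore, drop_underscore))
```

Both approaches give the same separator for these column names; they would only differ if a name had zero or several underscores after the prefix.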
NelsonGon
  • If you do that, also consider `set_names()` from `rlang` (exported to `purrr`). It can support a formula notation and some other goodies. –  Aug 22 '19 at 14:10
  • Thanks, I have added this option although it's almost the same as other options provided with `rename_all`. – NelsonGon Aug 22 '19 at 14:18
  • Hello, @Adam, learning how to deal with purrr is my next goal. =D – Luis Aug 22 '19 at 16:59