Changing the column name based on a partial string or substring

Question

I have a data frame df, which I generate 5 times, once for each of 5 different variables. Let's say the variable names are:

Apple  # apple_df
Mango  # mango_df
Banana # banana_df
Potato # potato_df
Tomato # tomato_df

Each time the data frame is generated, one of the column names is quite long, such as:

Apple - Growth Level Judgement    # Column name for apple_df
Mango - Growth Level Judgement    # Column name for mango_df
Banana - Growth Level Judgement   # Column name for banana_df
Potato - Growth Level Judgement   # Column name for potato_df
Tomato - Growth Level Judgement   # Column name for tomato_df

I want to change the above column names to just the word Growth across each of the files.

Is there a way to do this with one common line of code that I can run on each data frame separately?

I can use the complete name in each of the files separately but was wondering if we could have a generalised solution:

# For the Apple data frame
library(data.table)  # provides setnames()

# Update the column name
setnames(apple_df, 
         old = c('Apple - Growth Level Judgement'), 
         new = c('Growth'))

If I use the following regex-based solution, it replaces only the part of the name that is common across all data frames, not the whole name.

gsub(x = names(apple_df), 
     pattern = "Growth Level Judgement$", replacement = "Growth")

The post "Remove part of column name" is related, but it strips the known part of the string. In my case, I want to detect a column by a partial string that stays the same across multiple datasets and, once that string is found in the column name, replace the whole column name. The posts "r Remove parts of column name after certain characters" and "Rename column names according to pattern matching R" may also be related, but they do not meet my needs.

Any advice on this would be greatly appreciated. Thanks!

akrun · Answer 1 · 2021-10-26T03:44:46.417

2

Use endsWith from base R

names(Apple)[endsWith(names(Apple), 'Growth Level Judgement')] <- 'Growth'

According to the documentation (?endsWith), it may also be faster than a regex-based match:

startsWith() is equivalent to but much faster than

substring(x, 1, nchar(prefix)) == prefix

or also

grepl("^<prefix>", x)
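As a minimal sketch of how the endsWith() line could be turned into one statement that is run separately on each data frame: the helper name rename_by_suffix, its arguments, and the toy data below are assumptions made for illustration and are not part of the original answer.

```r
# Toy stand-in for the asker's data; the values are invented.
apple_df <- data.frame(id = 1:3, x = c(2.1, 3.4, 1.8))
names(apple_df)[2] <- "Apple - Growth Level Judgement"

# Hypothetical helper: replace the whole name of the column whose
# name ends with `suffix`.
rename_by_suffix <- function(df, suffix = "Growth Level Judgement",
                             new = "Growth") {
  hit <- endsWith(names(df), suffix)
  # Stop instead of silently creating duplicate names.
  if (sum(hit) > 1) stop("more than one column ends with: ", suffix)
  names(df)[hit] <- new
  df
}

# The same statement works for each data frame in turn.
apple_df <- rename_by_suffix(apple_df)
names(apple_df)
```

Because endsWith() compares literal strings, the suffix needs no regex escaping.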

edited Oct 26 '21 at 03:44

answered Oct 26 '21 at 03:38

akrun


Ronak Shah · Accepted Answer · 2021-10-26T02:20:44.727

1

Put the data frames in a list and use lapply/map to change the column name in every data frame. Then use list2env to transfer those changes from the list back to the individual data frames.

library(dplyr)
library(purrr)

list_df <- lst(Apple, Mango, Banana, Potato, Tomato)

list_df <- map(list_df, 
             ~.x %>% rename_with(~'Growth', matches('Growth Level Judgement')))

list2env(list_df, .GlobalEnv)
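The same idea can be sketched with lapply() in base R, which the answer mentions but does not show. The toy data frames and the make_toy helper are hypothetical stand-ins for the asker's data, and fixed = TRUE is an added choice so that the pattern is matched literally rather than as a regular expression.

```r
# Hypothetical helper that builds a small stand-in data frame.
make_toy <- function(prefix) {
  df <- data.frame(id = 1:2, score = c(0.5, 0.7))
  names(df)[2] <- paste(prefix, "- Growth Level Judgement")
  df
}
Apple <- make_toy("Apple")
Mango <- make_toy("Mango")

# A named list, so list2env() knows which objects to overwrite.
list_df <- list(Apple = Apple, Mango = Mango)

list_df <- lapply(list_df, function(df) {
  hit <- grepl("Growth Level Judgement", names(df), fixed = TRUE)
  names(df)[hit] <- "Growth"
  df
})

# environment() is the current environment, which is .GlobalEnv
# when this is run at the top level.
list2env(list_df, envir = environment())
names(Apple)
names(Mango)
```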

To run it on a single data frame, you can do:

Apple %>% rename_with(~'Growth', matches('Growth Level Judgement'))

Or in base R:

names(Apple)[grep('Growth Level Judgement', names(Apple))] <- 'Growth'

edited Oct 26 '21 at 02:20

answered Oct 26 '21 at 02:09

Ronak Shah


Thank you for posting this solution. Perhaps I did not make this clear in my post: I want to run one generalised statement separately on each data frame. I do not want to put the data frames in a list or process them all together. – Sandy Oct 26 '21 at 02:14

Did you try running `Apple <- Apple %>% rename_with(~'Growth', matches('Growth Level Judgement'))`? – Ronak Shah Oct 26 '21 at 02:16

Can multiple changes be made within a single `rename_with()` command? – Sandy Oct 26 '21 at 02:25

Do you mean like this? `mtcars %>% rename_with(~c('A', 'B'), matches('mpg|cyl')) %>% head` – Ronak Shah Oct 26 '21 at 02:30

It did not work on my original data. I get the following error: `Error: Names must be unique. x These names are duplicated: * "Overall Indicator Level Judgement" at locations 54, 56, 58, and 60. * "10 Learning Community Participation" at locations 55, 57, 59, and 61.` – Sandy Oct 26 '21 at 02:42

Well, without knowing the details of your original data I don't think I'll be able to debug this further. In the previous comment I have shown how it works on the `mtcars` dataset. If you are unable to resolve this, feel free to ask a new question with details of your original data. – Ronak Shah Oct 26 '21 at 02:48
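One possible reading of the error above is that the original data already contains repeated column names. That is an assumption, since the data was never posted. Under that assumption, a base R sketch that renames by position does not require the existing names to be unique; the toy data frame below is invented and only reuses one column name from the error message.

```r
# Invented data with a repeated column name.
df <- data.frame(1:2, 3:4, 5:6)
names(df) <- c("Overall Indicator Level Judgement",
               "Overall Indicator Level Judgement",
               "Apple - Growth Level Judgement")

# Check whether any names are repeated before renaming.
has_dupes <- anyDuplicated(names(df)) > 0

# Assigning through names<- works by position, so repeated names
# elsewhere in the data frame do not get in the way.
names(df)[endsWith(names(df), "Growth Level Judgement")] <- "Growth"

# Optionally make the remaining names unique afterwards.
names(df) <- make.unique(names(df))
names(df)
```

make.unique() appends a numeric suffix to repeated names, so whether it is appropriate depends on how the duplicated columns should be told apart.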

score 1 · Answer 3 · answered Oct 26 '21 at 02:36

An alternate solution could be:

Apple %>% 
      rename_with(~'Growth', ends_with('Growth Level Judgement'))

answered Oct 26 '21 at 02:36

Sandy

