1

I have a large data set with thousands of columns. The column names include various unwanted characters as follows:

col1*
col2*
col3*[Note]

I would like to remove everything from the * onward (that is, the trailing * and *[Note]) from all column names, leaving clean names:

col1
col2
col3

What is the most efficient way to do this for 5000+ columns?

user438383
  • It's easier to help you if you provide a [reproducible example](https://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example) with sample input and desired output that can be used to test and verify possible solutions. The example doesn't have to have 1000s of columns, just a few to get the point across. – MrFlick Aug 22 '22 at 18:58

2 Answers

3

We could use sub() from base R, which removes the first * and everything after it in each name:

names(df1) <- sub("\\*.*", "", names(df1))
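A minimal sketch to check this on a small example. The data frame below is hypothetical (the question does not include sample data); only the column names are taken from the question.

```r
# Hypothetical sample data; the names mirror those in the question.
df1 <- data.frame(a = 1:2, b = 3:4, c = 5:6)
names(df1) <- c("col1*", "col2*", "col3*[Note]")

# "\\*" matches a literal asterisk and ".*" matches the rest of the name,
# so the first "*" and everything after it is replaced with "".
names(df1) <- sub("\\*.*", "", names(df1))

names(df1)
# [1] "col1" "col2" "col3"
```

Because sub() is vectorised over the character vector of names, this is a single call no matter how many columns there are, so it should scale to 5000+ columns without a loop.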
akrun
1

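A dplyr solution, using rename_with() together with str_remove() from stringr:

library(dplyr)
library(stringr)
# Assign the result back; rename_with() returns a new data frame.
df1 <- df1 %>%
  rename_with(~str_remove(string = ., pattern = "\\*.*"), everything())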
Julian