It looks like this question has been asked many times, but I cannot find a solution that matches what I am looking for.

I have a dataset with many columns, at least 100. Many of the column names share the same prefix and differ only in the suffix. Something like this:

   Id Date        ABC_XYZ_1  ABC_XYZ_2   ABC_XYZ_3   ABC_XYZ_4..........ABC_XYZ_65    ABC_XYZ_DT ABC_XYZ_BY
   1  10/11/2021  1          1           3           -8       ..........3            12/11/2021   Joe Brown
   2  08/12/2002  4          2           4           -6       ..........4            07/14/2021   Bailey Ron

I am only interested in the 65 columns ABC_XYZ_1, ABC_XYZ_2, ABC_XYZ_3, ... through ABC_XYZ_65.

Expected output: a dataset with 65 columns, like this

ABC_XYZ_1  ABC_XYZ_2   ABC_XYZ_3   ABC_XYZ_4..........ABC_XYZ_65
1          1           3           -8       ..........3
4          2           4           -6       ..........4 

I know I can use something like df[, c("ABC_XYZ_1", "ABC_XYZ_2", "ABC_XYZ_3", ......, "ABC_XYZ_65")], but I do not want to type out the column names ABC_XYZ_1, ABC_XYZ_2, ABC_XYZ_3......ABC_XYZ_65 one by one. Any suggestions for an efficient way to accomplish this are much appreciated. Thanks in advance.

bison2178
  • `df %>% select(contains('ABC'))` using `dplyr` – Ronak Shah Oct 03 '21 at 04:39
  • @RonakShah, if I use your solution I get every column containing `ABC`, including `ABC_XYZ_DT` and `ABC_XYZ_BY`, which I don't want. I only want columns `ABC_XYZ_1 ABC_XYZ_2 ABC_XYZ_3.....ending in ABC_XYZ_65` – bison2178 Oct 03 '21 at 04:40
  • @bison2178 You can do `df %>% select(matches("^ABC_XYZ_\\d+$"))` or another option is `df[, paste0("ABC_XYZ_", 1:65)]` – akrun Oct 03 '21 at 04:41
  • @akrun, that worked. If you post this as a solution, I will accept it. – bison2178 Oct 03 '21 at 04:43
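
For anyone landing here, the two ideas from akrun's comment can be tried on a small stand-in for the data. This is only a sketch in base R: the toy `df` below is made up for illustration (three numbered columns instead of 65), and the names `numbered` and `by_name` are hypothetical. The pattern uses `[0-9]+` in place of `\\d+`, which should match the same column names here.

```r
# Toy data frame standing in for the real dataset (assumption: values are
# invented and only three numbered columns are shown instead of 65)
df <- data.frame(
  Id = 1:2,
  Date = c("10/11/2021", "08/12/2002"),
  ABC_XYZ_1 = c(1, 4),
  ABC_XYZ_2 = c(1, 2),
  ABC_XYZ_3 = c(3, 4),
  ABC_XYZ_DT = c("12/11/2021", "07/14/2021"),
  ABC_XYZ_BY = c("Joe Brown", "Bailey Ron")
)

# Option 1: keep columns whose whole name is the prefix followed by digits.
# The ^ and $ anchors are what exclude ABC_XYZ_DT and ABC_XYZ_BY.
numbered <- df[, grepl("^ABC_XYZ_[0-9]+$", names(df)), drop = FALSE]

# Option 2: build the column names instead of typing them
# (1:3 here; 1:65 for the dataset in the question)
by_name <- df[, paste0("ABC_XYZ_", 1:3)]

names(numbered)
# "ABC_XYZ_1" "ABC_XYZ_2" "ABC_XYZ_3"
```

The regex version does not need to know how many numbered columns exist, while the `paste0()` version fails with an error if one of the expected columns is missing, which may be the behaviour you prefer when the range is fixed.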
