I need to extract the columns whose names contain an exact string. For example, I have this data frame x:

structure(list(Time = structure(1L, .Label = "1/1/2015", class = "factor"), 
    WTAD..Linux..Linux.Percent.of.Physical.Memory.and.Swap.Used.on.web02.Total.Phys.Mem.MB. = 3555L, 
    WTAD..Linux..Linux.Percent.of.Physical.Memory.and.Swap.Used.on.web02.Total.Phys.Mem.Free.MB. = 55L, 
    WTAD..Linux..Linux.Percent.of.Physical.Memory.and.Swap.Used.on.web02.Total.Swap.Free.MB. = 44L, 
    WTAD..Linux..Linux.Percent.of.Physical.Memory.and.Swap.Used.on.web02.Total.Cache.Free.MB. = 66L, 
    WTAD..Linux..Linux.Percent.of.Physical.Memory.and.Swap.Used.on.web02.Total.Swap.And.Cache.Free.MB. = 44L, 
    WTAD..Linux..Linux.Percent.of.Physical.Memory.and.Swap.Used.on.web02.Total.Percent.Free = 44L, 
    WTAD..Linux..Linux.Percent.of.Physical.Memory.and.Swap.Used.on.web02.Round.Trip.Time = 44L), .Names = c("Time", 
"WTAD..Linux..Linux.Percent.of.Physical.Memory.and.Swap.Used.on.web02.Total.Phys.Mem.MB.", 
"WTAD..Linux..Linux.Percent.of.Physical.Memory.and.Swap.Used.on.web02.Total.Phys.Mem.Free.MB.", 
"WTAD..Linux..Linux.Percent.of.Physical.Memory.and.Swap.Used.on.web02.Total.Swap.Free.MB.", 
"WTAD..Linux..Linux.Percent.of.Physical.Memory.and.Swap.Used.on.web02.Total.Cache.Free.MB.", 
"WTAD..Linux..Linux.Percent.of.Physical.Memory.and.Swap.Used.on.web02.Total.Swap.And.Cache.Free.MB.", 
"WTAD..Linux..Linux.Percent.of.Physical.Memory.and.Swap.Used.on.web02.Total.Percent.Free", 
"WTAD..Linux..Linux.Percent.of.Physical.Memory.and.Swap.Used.on.web02.Round.Trip.Time"
), class = "data.frame", row.names = c(NA, -1L))

I need to extract only the column whose name contains the exact string ".Total.Phys.Mem.MB."

When I do this:

x[,grepl(".Total.Phys.Mem.MB.", colnames(x)[2:ncol(x)])]

I don't get the column whose name contains the string ".Total.Phys.Mem.MB.". Is there a better way in R to extract the columns whose names contain a given string?

user1471980
  • Try `x[, grepl("\\.Total\\.Phys\\.Mem\\.MB\\.", colnames(x))]` – David Arenburg Jan 22 '15 at 21:58
  • @David Arenburg, I also need to extract the Time column. I tried `x[grepl("\\.Total\\.Phys\\.Mem\\.MB\\.", colnames(x)[2:ncol(x)])]` and got the same result: I cannot extract the columns from the data frame. – user1471980 Jan 23 '15 at 15:57

2 Answers

library(dplyr)

select(x, contains(".Total.Phys.Mem.MB."))
  WTAD..Linux..Linux.Percent.of.Physical.Memory.and.Swap.Used.on.web02.Total.Phys.Mem.MB.
1                                                                                    3555
DatamineR
  • Can you do an or (|) in dplyr's contains? I tried `select(x, contains("Time"|".Total.Phys.Mem.MB."))` and got this error: operations are possible only for numeric, logical or complex types – user1471980 Jan 23 '15 at 15:30
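
On the "or" question in the comment above: `|` cannot be applied to two character strings, which is what the error is reporting. A minimal sketch of two alternatives follows, assuming a reasonably recent dplyr. The data frame here is a shortened, hypothetical stand-in for x, not the original. Note that `contains("Time")` would also pick up the `Round.Trip.Time` column, so Time is selected by name instead.

```r
library(dplyr)

# Hypothetical stand-in for x with shortened column names (not the original data).
x <- data.frame(
  Time = "1/1/2015",
  web02.Total.Phys.Mem.MB. = 3555L,
  web02.Total.Phys.Mem.Free.MB. = 55L,
  web02.Round.Trip.Time = 44L
)

# Option 1: name Time explicitly, then add the literal-substring helper.
res1 <- select(x, Time, contains(".Total.Phys.Mem.MB."))

# Option 2: matches() takes a regular expression, so | works inside the pattern.
# ^Time$ anchors the match so that Round.Trip.Time is not selected.
res2 <- select(x, matches("^Time$|\\.Total\\.Phys\\.Mem\\.MB\\."))

names(res1)
names(res2)
```

`contains()` matches its argument literally, so the dots need no escaping there; in `matches()` they do.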

Unless fixed=TRUE is set, grepl treats the pattern as a regular expression, and in a regex the dot is a special character (it matches any single character), so it must be escaped to match a literal dot.

> x[grepl("\\.Total\\.Phys\\.Mem\\.MB\\.", colnames(x))]
  WTAD..Linux..Linux.Percent.of.Physical.Memory.and.Swap.Used.on.web02.Total.Phys.Mem.MB.
1                                                                                    3555

OR

> x[grepl('.Total.Phys.Mem.MB.', colnames(x), fixed=TRUE)]
  WTAD..Linux..Linux.Percent.of.Physical.Memory.and.Swap.Used.on.web02.Total.Phys.Mem.MB.
1                                                                                    3555
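
To also keep the Time column, as asked in the comments, the logical index has to cover every column name. The attempts in the question and comments test only `colnames(x)[2:ncol(x)]`, so the index is one element short and shifted by one column relative to x. A minimal base R sketch follows, using a shortened, hypothetical stand-in for x rather than the original data.

```r
# Hypothetical stand-in for x with shortened column names (not the original data).
x <- data.frame(
  Time = "1/1/2015",
  web02.Total.Phys.Mem.MB. = 3555L,
  web02.Total.Phys.Mem.Free.MB. = 55L,
  web02.Round.Trip.Time = 44L
)

pat <- ".Total.Phys.Mem.MB."

# Test every column name so the logical index lines up with the columns of x.
hit <- grepl(pat, colnames(x), fixed = TRUE)

# Keep Time as well by OR-ing in an exact name comparison.
keep <- colnames(x) == "Time" | hit
res <- x[, keep, drop = FALSE]
colnames(res)

# For comparison: dropping the first name before matching shifts the index.
short <- grepl(pat, colnames(x)[2:ncol(x)], fixed = TRUE)
which(hit)    # position of the match among all column names
which(short)  # position of the same match after dropping "Time"
```

`drop = FALSE` keeps the result a data frame even when only one column is selected.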
hwnd