
I have a series of Excel files, and I have been using this basic code to import my data for a very long time. I have not made any changes to the data or the code, but the rows are no longer filtered properly. I read the files as follows:

apply_fun <- function(filename) {
  data <- readxl::read_excel(filename, sheet = "Section Cut Forces - Design", 
                       skip = 1, col_names = TRUE) 
  data <- data[-1,] 
  data <- subset(data, StepType == "Max")
  data <- data[,c(1,2,6,10)]
  data$id <- filename
  return(data)
}

filenames <- list.files(pattern = "\\.xlsx", full.names = TRUE)

first <- lapply(filenames,apply_fun)
out <- do.call(rbind,first)

The first few rows of out look like:

structure(list(SectionCut = c("1", "1", "1", "1", "1", "2", "2", 
"2", "2", "2"), OutputCase = c("Service (after losses)", "LL-1", 
"LL-2", "LL-3", "LL-4", "Service (after losses)", "LL-1", "LL-2", 
"LL-3", "LL-4"), V2 = c("11.522", "28.587", "42.246000000000002", 
"44.212000000000003", "36.183", "9.8469999999999995", "23.989000000000001", 
"37.408999999999999", "43.401000000000003", "40.450000000000003"
), M3 = c("299728.66100000002", "42863.517999999996", "63147.332999999999", 
"69628.464000000007", "59196.74", "0", "27.942", "44.863999999999997", 
"46.31", "36.204999999999998"), id = c("./100-12-S00.xlsx", "./100-12-S00.xlsx", 
"./100-12-S00.xlsx", "./100-12-S00.xlsx", "./100-12-S00.xlsx", 
"./100-12-S00.xlsx", "./100-12-S00.xlsx", "./100-12-S00.xlsx", 
"./100-12-S00.xlsx", "./100-12-S00.xlsx")), row.names = c(NA, 
-10L), class = c("tbl_df", "tbl", "data.frame"))

I try to remove the "Service (after losses)" rows with:

out2 <- out[!grep("Service (after losses)", out$OutputCase),]

but the result is 0 observations.

I must say that this just started being an issue for me. I have been able to run this code successfully for months now and never had an issue.

Maral Dorri

1 Answer


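( and ) are special characters in regex: in functions like grep/grepl they delimit a group instead of matching literal parentheses, so your pattern never matches the text in OutputCase. You can use fixed = TRUE in grep to match the string literally. Also, ! should be used with grepl (which returns a logical vector) and - should be used with grep (which returns integer indices) to remove rows.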

out[-grep("Service (after losses)", out$OutputCase, fixed = TRUE),]
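
To make the difference concrete, here is a minimal sketch on a toy data frame (the three-row `out` below is made up for illustration and only mimics the OutputCase column from the question). It shows the regex pattern matching nothing, the fixed = TRUE and escaped versions matching, and why the grepl form is the safer one when there might be no match at all:

```r
# Toy data, hypothetical values; only the OutputCase strings come from the question
out <- data.frame(
  OutputCase = c("Service (after losses)", "LL-1", "LL-2"),
  V2 = c(11.522, 28.587, 42.246),
  stringsAsFactors = FALSE
)

# As a regex, "(after losses)" is a group, so this searches for the text
# "Service after losses" (no parentheses) and finds nothing
regex_hits <- grep("Service (after losses)", out$OutputCase)

# fixed = TRUE treats the pattern as a literal string
fixed_hits <- grep("Service (after losses)", out$OutputCase, fixed = TRUE)

# Escaping the parentheses also works if you want to stay with regex
escaped_hits <- grep("Service \\(after losses\\)", out$OutputCase)

# Two equivalent ways to drop the matching rows
dropped_by_index <- out[-fixed_hits, ]
dropped_by_mask  <- out[!grepl("Service (after losses)", out$OutputCase,
                               fixed = TRUE), ]

# Pitfall: with no matches, -integer(0) is still integer(0), so the
# negative-index form returns zero rows instead of all rows
no_match_result <- out[-regex_hits, ]
```

Because of that last case, the `!grepl(...)` form is usually the one to prefer: when nothing matches, it keeps every row.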

Apart from that, this looks like an exact match, so there is no need for pattern matching with grep. Try:

out[out$OutputCase != 'Service (after losses)', ]
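
One caveat with the `!=` approach, in case OutputCase can contain missing values (an assumption, since the sample shown has none): a comparison with NA yields NA, and indexing a data frame with NA produces an all-NA row. A small sketch with made-up data, using `%in%` as an alternative that never returns NA:

```r
# Toy data, hypothetical values, with one missing OutputCase
out <- data.frame(
  OutputCase = c("Service (after losses)", "LL-1", NA),
  V2 = c(1, 2, 3),
  stringsAsFactors = FALSE
)

# != gives FALSE, TRUE, NA; the NA turns into a row filled with NA
with_neq <- out[out$OutputCase != "Service (after losses)", ]

# %in% gives TRUE, FALSE, FALSE; negated, the NA row is kept intact
with_in <- out[!(out$OutputCase %in% "Service (after losses)"), ]
```

If the column never contains NA, both forms give the same result.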
Ronak Shah