I have a set of 'filename.extension' strings and I want to extract just the filename. I am having trouble extracting the full filename when the filename shares a character with the file extension. For example, the filename.extension "qrs.sas7bdat" has

    filename="qrs"
    extension="sas7bdat"

In this case, the filename and the extension share the character "s".

Here's some R code to give more context:


    files_sas <- c("abc.sas7bdat","qrs.sas7bdat")
    stringr::str_extract(files_sas,"(?:.*|.*s)[^\\.sas7bdat]")

This code returns the following character vector:

"abc" "qr" 

This is not what I want. The desired result is:

c("abc","qrs")

It looks like I'm close, and so I am hoping someone might be able to help me get my desired result.

Many thanks.

HumanityFirst

1 Answer

We can use sub to match the literal dot (. is a metacharacter that matches any character, so we escape it as \\.) followed by any remaining characters (.*). In the replacement, we specify an empty string ("").

    sub("\\..*", "", files_sas)
    #[1] "abc" "qrs"

Or with stringr

    library(stringr)
    str_remove(files_sas, "\\..*")

Or with file_path_sans_ext

    tools::file_path_sans_ext(files_sas)
    #[1] "abc" "qrs"
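
As a side note on why the pattern in the question returned "qr": `[^\\.sas7bdat]` is a character class, so it matches one character that is not any of `.`, `s`, `a`, `7`, `b`, `d`, `t`. It does not mean "not followed by the string .sas7bdat". The greedy `.*` therefore backtracks to the last character outside that set, which is `c` in the first name and `r` in the second. Below is a minimal base R sketch of this, together with two possible fixes that name the literal extension. It assumes every file ends in `.sas7bdat`; the variable names `bad`, `fixed`, and `looked` are only for illustration.

```r
files_sas <- c("abc.sas7bdat", "qrs.sas7bdat")

# The pattern from the question, run through base R with PCRE instead of stringr.
# The bracket expression is a set of single characters, so the match has to
# end on a character outside {., s, a, 7, b, d, t}.
bad <- regmatches(files_sas,
                  regexpr("(?:.*|.*s)[^\\.sas7bdat]", files_sas, perl = TRUE))
bad
#[1] "abc" "qr"

# Remove the literal extension, anchored at the end of the string.
fixed <- sub("\\.sas7bdat$", "", files_sas)
fixed
#[1] "abc" "qrs"

# Or keep an extraction-style call by using a lookahead for the extension.
looked <- regmatches(files_sas,
                     regexpr(".*(?=\\.sas7bdat$)", files_sas, perl = TRUE))
looked
#[1] "abc" "qrs"
```

Anchoring on the literal extension is stricter than `"\\..*"`: a name that does not end in `.sas7bdat` is left unchanged by `sub`.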
akrun
The first two variants cannot handle filenames that contain dots, which are widely allowed. I would therefore recommend only your third variant; the regex in the source code of `file_path_sans_ext` is more complicated than `"\\..*"` for this very reason. For completeness: `sub("([^.]+)\\.[[:alnum:]]+$", "\\1", x)` – zerweck Jun 19 '20 at 18:22
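
To illustrate the point made in this comment, here is a small sketch comparing the variants on a filename that itself contains a dot. The name "my.data.sas7bdat" and the variable names are hypothetical, chosen only for the demonstration.

```r
# Hypothetical input: the second name has a dot inside the filename part.
files_dotted <- c("abc.sas7bdat", "my.data.sas7bdat")

# Cutting at the first dot loses everything after "my".
first_dot <- sub("\\..*", "", files_dotted)
first_dot
#[1] "abc" "my"

# file_path_sans_ext only strips the final extension.
sans_ext <- tools::file_path_sans_ext(files_dotted)
sans_ext
#[1] "abc"     "my.data"

# The regex quoted in the comment gives the same result here.
last_dot <- sub("([^.]+)\\.[[:alnum:]]+$", "\\1", files_dotted)
last_dot
#[1] "abc"     "my.data"
```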