
I am looking for a regex that extracts the filename, without its extension, from a path, for use in this code:

String filename = fullpath.replaceFirst(regex, "$1")

For a start, here is the simplest case and what I have so far:

  • /path/filename.ext -> filename (fullpath.replaceFirst(".*/(.*)\\..*", "$1"))

Here are some more advanced cases that I need help with:

  • /filename.ext -> filename (can start with /)
  • filename. -> filename (can end with .)
  • /filename -> filename (can have no .)
  • filename.ext -> filename (can have no /)
  • filename -> filename (can have no . and /)
  • .filename -> .filename (can start with .)
  • /path/.filename -> .filename (can start with . right after /)
  • filename.part1.ext -> filename.part1 (can have middle .)
  • /path_a/path.b/ -> (empty string) (can have no filename)
  • /path_a/path.b/filename -> filename (can have . in path before /)

Edit: There is no actual file here, and fullpath does not point to any file. It comes from a URL request.

user1589188
  • `new File(fullpath).getName()` – XtremeBaumer Nov 15 '19 at 08:11
  • @XtremeBaumer that failed many cases above unfortunately. – user1589188 Nov 15 '19 at 08:15
  • Because you say that a folder is not a file. Once you filter out the folders, it works 100%. But folders are files and have names too. Also, the extension is usually part of the name – XtremeBaumer Nov 15 '19 at 08:16
  • @XtremeBaumer I am not too sure what you are talking about? And I tried your code, it did not give correct answers. And I did not mention folder anywhere, why are you saying folder? – user1589188 Nov 15 '19 at 08:21
  • `/path_a/path.b/` is a valid path to a folder instead of a file. Yet it still has a name – XtremeBaumer Nov 15 '19 at 08:25
  • https://stackoverflow.com/a/990408/7109162 + `new File(fullpath).getName()` – XtremeBaumer Nov 15 '19 at 08:25
  • @XtremeBaumer you confuse me a lot here. I said I tried your code, it did not give correct result already, why are you still promoting it? `new File("/path/filename.ext").getName()` gives "filename.ext" not "filename" that I required in the question. – user1589188 Nov 15 '19 at 08:29
  • Possible duplicate of [How do I trim a file extension from a String in Java?](https://stackoverflow.com/questions/941272/how-do-i-trim-a-file-extension-from-a-string-in-java) – XtremeBaumer Nov 15 '19 at 08:35
  • @XtremeBaumer I read the link, but no, I am not dealing with an actual file here. It's an incoming URL request that I need to extract the filename part from. – user1589188 Nov 15 '19 at 08:40

3 Answers


The following regex matches the desired parts:

^(?:.*\/)?([^\/]+?|)(?=(?:\.[^\/.]*)?$)

Explanation:

  • ^ Match start of the line
  • (?: Start of a non-capturing group
    • .*\/ Match up to last / character
  • )? End of the non-capturing group (optional)
  • ([^\/]+?|) Capture anything but / lazily, or nothing at all
  • (?= Start of a positive lookahead
    • (?:\.[^\/.]*)? Match an extension (optional)
    • $ Assert end of the line
  • ) End of the positive lookahead
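
As a minimal sketch of how the pattern explained above could be used from Java, the example below compiles it once and reads capture group 1 through a Matcher. The class and method names (BaseNameMatcher, baseName) are hypothetical and chosen only for illustration; the pattern itself is the one from this answer, with every backslash doubled for the Java string literal.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class; the names are illustrative, not from the original answer.
class BaseNameMatcher {
    // Same pattern as in the answer; each regex backslash is doubled in a Java string literal.
    private static final Pattern BASE_NAME =
            Pattern.compile("^(?:.*\\/)?([^\\/]+?|)(?=(?:\\.[^\\/.]*)?$)");

    static String baseName(String fullpath) {
        Matcher m = BASE_NAME.matcher(fullpath);
        // The lookahead does not consume the extension, so group 1 holds only the name.
        return m.find() ? m.group(1) : "";
    }

    public static void main(String[] args) {
        String[] samples = {
            "/path/filename.ext", "filename.", ".filename",
            "filename.part1.ext", "/path_a/path.b/", "/path_a/path.b/filename"
        };
        for (String s : samples) {
            System.out.println(s + " -> \"" + baseName(s) + "\"");
        }
    }
}
```

Reading the group directly avoids the replacement step entirely, which is why the lookahead form works here.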

If you are dealing with a multi-line input string and need a slightly faster regex, try this one instead (with the m flag on):

^(?:[^\/\r\n]*\/)*([^\/\r\n]+?|)(?=(?:\.[^\/\r\n.]*)?$)

See live demo here

Filename would be captured in the first capturing group.
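
For the multi-line variant, a sketch in Java might look like the following. It assumes the input holds one path per line separated by \n, and it maps the m flag to Pattern.MULTILINE; the class and method names (MultiLineBaseNames, baseNames) are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class; the names are illustrative only.
class MultiLineBaseNames {
    // Multi-line pattern from the answer; Pattern.MULTILINE plays the role of the m flag.
    private static final Pattern PER_LINE = Pattern.compile(
            "^(?:[^\\/\\r\\n]*\\/)*([^\\/\\r\\n]+?|)(?=(?:\\.[^\\/\\r\\n.]*)?$)",
            Pattern.MULTILINE);

    // Collects capture group 1 of every match found in the input.
    static List<String> baseNames(String lines) {
        List<String> names = new ArrayList<>();
        Matcher m = PER_LINE.matcher(lines);
        while (m.find()) {
            names.add(m.group(1));
        }
        return names;
    }

    public static void main(String[] args) {
        String input = "/path/filename.ext\nfilename.part1.ext\n/path_a/path.b/";
        System.out.println(baseNames(input));
    }
}
```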

revo
  • Thanks a lot, it seems yours is working in the demo, but I can't put it in my java code, it seems the syntax is a bit different? – user1589188 Nov 15 '19 at 08:38
  • @user1589188 Yes, wherever there is one backslash you need two. – revo Nov 15 '19 at 08:39
  • Yes I did that. Please help me out here `"/path/filename.ext".replaceFirst("^(?:.*\\/)?([^\\/]+?|)(?=(?:\\.[^\\/.]*)?$)", "$1")` this does not print "filename" but "filename.ext". What am I doing wrong? – user1589188 Nov 15 '19 at 08:46
  • @user1589188 You shouldn't use a replacement method here but if you have to, you should change positive lookahead to a simple matching cluster: `"/path/filename.ext".replaceFirst("^(?:.*\\/)?([^\\/]+?|)(?:\\.[^\\/.]*)?$", "$1")` – revo Nov 15 '19 at 08:52
  • @user1589188 This also can be helpful https://regex101.com/r/9FUtDb/1/codegen?language=java – revo Nov 15 '19 at 08:53
  • Thank you, the code in the comment did it. Although I don't understand why the difference, but there you go, points for you! – user1589188 Nov 15 '19 at 08:54
  • Very impressive! Great job @revo! That's working well and your explanation is perfect, thank you. – Gosfly Nov 15 '19 at 09:30
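
The difference discussed in the comments above comes down to what the match consumes: replaceFirst only replaces the text the regex actually matched, and a lookahead checks the extension without consuming it, so the extension is left in the result. The sketch below contrasts the two patterns quoted in the comments; the class and method names (ReplaceFirstDemo, viaLookahead, viaConsuming) are hypothetical.

```java
// Hypothetical demo class; the names are illustrative only.
class ReplaceFirstDemo {
    // Lookahead version: the extension is asserted but stays outside the match.
    static final String WITH_LOOKAHEAD = "^(?:.*\\/)?([^\\/]+?|)(?=(?:\\.[^\\/.]*)?$)";
    // Consuming version from the comment: the extension becomes part of the match.
    static final String CONSUMING = "^(?:.*\\/)?([^\\/]+?|)(?:\\.[^\\/.]*)?$";

    static String viaLookahead(String fullpath) {
        // Only "/path/filename" is replaced by "$1", so ".ext" survives.
        return fullpath.replaceFirst(WITH_LOOKAHEAD, "$1");
    }

    static String viaConsuming(String fullpath) {
        // The whole string is matched and replaced by group 1.
        return fullpath.replaceFirst(CONSUMING, "$1");
    }

    public static void main(String[] args) {
        System.out.println(viaLookahead("/path/filename.ext")); // filename.ext
        System.out.println(viaConsuming("/path/filename.ext")); // filename
    }
}
```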

You can call getName() on a File object and then strip the extension with a regex. You can also check whether the path is a file:

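File file = new File(fullpath);
// String.replace() is literal, so use replaceFirst() with a regex; the backslash must be doubled in Java.
// This pattern strips only the last extension.
if (file.isFile()) return file.getName().replaceFirst("\\.[^.]*$", "");
else return "";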
SteapStepper69

Take the part of the path (String pathFile) after the last / to get the file name with its extension, then strip the extension with FilenameUtils.removeExtension (from Apache Commons IO):

String nameDocument = pathFile.substring(pathFile.lastIndexOf("/") + 1);

String fileNameWithOutExt = FilenameUtils.removeExtension(nameDocument);
geisterfurz007
  • Hey thank you. I guess it is helpful for others. But in my case, it seems "/path/.filename" does not give ".filename" using that util. – user1589188 Nov 15 '19 at 08:56
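
For the leading-dot case raised in the comment above, one possible alternative is a plain-string sketch that uses only the standard library and no Commons IO. It is not part of the original answer: the class and method names (PlainStringBaseName, baseName) are hypothetical, and it assumes / is the only separator, as in the question's examples.

```java
// Hypothetical helper class; not from the original answer.
class PlainStringBaseName {
    static String baseName(String pathFile) {
        // Everything after the last '/'; the whole string when there is no '/'.
        String name = pathFile.substring(pathFile.lastIndexOf('/') + 1);
        int dot = name.lastIndexOf('.');
        // dot > 0 keeps a leading dot (".filename") as part of the name,
        // and dot == -1 (no dot at all) returns the name unchanged.
        return dot > 0 ? name.substring(0, dot) : name;
    }

    public static void main(String[] args) {
        System.out.println(baseName("/path/.filename"));
        System.out.println(baseName("filename.part1.ext"));
    }
}
```

Treating a dot at index 0 as part of the name is the design choice that makes hidden-file style names come through intact.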