97

Is there a better way to get file basename and extension than something like

File f = ...
String name = f.getName();
int dot = name.lastIndexOf('.');
String base = (dot == -1) ? name : name.substring(0, dot);
String extension = (dot == -1) ? "" : name.substring(dot+1);
Jason S
  • 8
    Take a look at [commons-io](http://commons.apache.org/io/) [`FilenameUtils`](http://commons.apache.org/io/api-2.0/org/apache/commons/io/FilenameUtils.html). It has the `getBaseName(..)` and `getExtension(..)` methods. – Bozho Dec 28 '10 at 12:20
  • For *only* the extension, see https://stackoverflow.com/questions/3571223/how-do-i-get-the-file-extension-of-a-file-in-java . – Andy Thomas Nov 05 '20 at 23:34

8 Answers

179

I know others have mentioned String.split, but here is a variant that only yields two tokens (the base and the extension):

String[] tokens = fileName.split("\\.(?=[^\\.]+$)");

For example:

"test.cool.awesome.txt".split("\\.(?=[^\\.]+$)");

Yields:

["test.cool.awesome", "txt"]

The regular expression tells Java to split on any period that is followed by one or more non-periods and then the end of input. At most one period matches this definition (namely, the last period).

Regexically speaking, this technique is called zero-width positive lookahead.
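A name without a period comes back from this split as a single-element array, so a caller that expects two tokens needs a small guard. A minimal sketch, in which the class and the `splitBaseAndExtension` helper are hypothetical names and not part of the original answer:

```java
import java.util.Arrays;

class LookaheadSplitDemo {
    // Splits on the last period only. When nothing matches (no period,
    // or a trailing period), the whole name is returned as the base
    // and the extension is the empty string.
    static String[] splitBaseAndExtension(String fileName) {
        String[] tokens = fileName.split("\\.(?=[^\\.]+$)");
        return tokens.length == 2 ? tokens : new String[] { fileName, "" };
    }

    public static void main(String[] args) {
        // prints [test.cool.awesome, txt]
        System.out.println(Arrays.toString(splitBaseAndExtension("test.cool.awesome.txt")));
        // prints [noextension, ]
        System.out.println(Arrays.toString(splitBaseAndExtension("noextension")));
    }
}
```

The guard keeps the regex untouched and only normalizes the shape of the result, which may be enough if the caller always indexes `[0]` and `[1]`.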


BTW, if you want to pull the full file name (including any dot extension) out of a path that uses forward slashes, you can split off the directory part:

    String[] tokens = dir.split(".+?/(?=[^/]+$)");

For example:

    String dir = "/foo/bar/bam/boozled";
    String[] tokens = dir.split(".+?/(?=[^/]+$)");
    // [ "", "boozled" ]: the directory part is consumed as the delimiter,
    // so the file name is the last token
Adam Paynter
  • Good simple solution without any extra dependencies! Just note that it does not handle it like you do: `String fileName = "noextension"` yields `["noextension"]` and not `["noextension", ""]` – dacwe Dec 28 '10 at 15:00
  • 3
    I have no idea why people are afraid of dependencies ;-) – Bozho Dec 28 '10 at 16:00
  • @dacwe: Fair enough. I wonder if there is a regular expression that can... – Adam Paynter Dec 28 '10 at 16:36
  • 3
    @Bozho: I agree that libraries are better solutions for this type of problem. It lets other people do the maintaining and thinking for you (that's why I up-voted your answer!). This may sound trivial, but there is part of me that always hesitates when I consider including an Apache library because I have suffered "JAR hell" in the past with some of their stuff (I know, it's trivial). – Adam Paynter Dec 28 '10 at 16:39
  • 4
    @Bozho: Adam's 100% right. This issue wouldn't be enough to warrant me taking on yet another library -- but if I were already using commons-io for other reasons, then I'd use FilenameUtils. – Jason S Jan 01 '11 at 00:10
  • @Jason S - I include -lang and -io by default. And I always use them. They do nothing bad but provide very helpful methods. Maybe some that you don't even suspect. (For example `IOUtils.copy(..)` is one of my favourite). :) – Bozho Jan 01 '11 at 01:00
  • @Bozho: Fair enough. I suppose I don't miss that functionality because I usually include Guava. – Adam Paynter Jan 01 '11 at 09:59
  • Well, yes, guava is often an alternative – Bozho Jan 01 '11 at 10:30
  • 1
    @Jason: Regular expressions: the gift that keeps on giving. :) – Adam Paynter Jan 20 '11 at 19:08
  • 1
    The right thing to use would be zero-width negative lookahead: `"\\.(?!.*\\.)"` – m0she Nov 13 '12 at 11:13
  • 1
    **Fails** for `archive.tar.gz`, the correct extension is `.tar.gz`... – Has QUIT--Anony-Mousse Jun 09 '15 at 21:40
  • 1
    It doesn’t "fail" because there’s no right answer. Python’s `splitext` also splits on the last dot. – bfontaine Oct 27 '17 at 14:51
  • 6
    @Bozho - Sarcasm? The real question is why java comes with endless piles of redundant classes that come so close to making it easy to do what you actually want to do, but then frustratingly never actually do it. There's no equivalent to Apache-Commons in Python because Python simply has all the useful stuff you want built-in already. C# seems to be another example of a language where you can focus on your unique problem instead of having to figure out how to reinvent the wheel or go get the wheel that someone else invented. – ArtOfWarfare May 17 '18 at 19:14
97

Old question but I usually use this solution:

import org.apache.commons.io.FilenameUtils;

String fileName = "/abc/defg/file.txt";

String basename = FilenameUtils.getBaseName(fileName);
String extension = FilenameUtils.getExtension(fileName);
System.out.println(basename); // file
System.out.println(extension); // txt (NOT ".txt" !)
Ilya Serbis
Oibaf it
  • Doesn't work on Windows when the String "fileName" is "D:\resources\ftp_upload.csv". Can you please help out? – NIKHIL CHAURASIA Jun 16 '16 at 10:10
  • 3
    @NIKHILCHAURASIA you need to escape the backslashes, by doubling them. Like: "D:\\resources\\ftp_upload.csv". – Ricket Oct 28 '16 at 00:21
8

http://docs.oracle.com/javase/6/docs/api/java/io/File.html#getName()

From http://www.xinotes.org/notes/note/774/ :

Java has built-in functions to get the basename and dirname for a given file path, but the function names are not so self-apparent.

import java.io.File;

public class JavaFileDirNameBaseName {
    public static void main(String[] args) {
        File theFile = new File("../foo/bar/baz.txt");
        System.out.println("Dirname: " + theFile.getParent());
        System.out.println("Basename: " + theFile.getName());
    }
}
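The same two lookups can also be done with `java.nio.file.Path`, assuming Java 7 or later. This is only a sketch: the class and helper names are hypothetical, and the separator in the directory part depends on the platform.

```java
import java.nio.file.Path;
import java.nio.file.Paths;

class PathNameDemo {
    // Last path element, e.g. "baz.txt"; empty string if the path has none.
    static String baseName(String path) {
        Path fileName = Paths.get(path).getFileName();
        return fileName == null ? "" : fileName.toString();
    }

    // Everything before the last path element; empty string if there is no parent.
    static String dirName(String path) {
        Path parent = Paths.get(path).getParent();
        return parent == null ? "" : parent.toString();
    }

    public static void main(String[] args) {
        System.out.println("Dirname: " + dirName("../foo/bar/baz.txt"));
        System.out.println("Basename: " + baseName("../foo/bar/baz.txt")); // Basename: baz.txt
    }
}
```

Neither `File` nor `Path` splits off the extension, so that step still needs `lastIndexOf('.')` or a library call.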
8

Source: http://www.java2s.com/Code/Java/File-Input-Output/Getextensionpathandfilename.htm

A utility class like this:

class Filename {
  private String fullPath;
  private char pathSeparator, extensionSeparator;

  public Filename(String str, char sep, char ext) {
    fullPath = str;
    pathSeparator = sep;
    extensionSeparator = ext;
  }

  public String extension() {
    int dot = fullPath.lastIndexOf(extensionSeparator);
    return fullPath.substring(dot + 1);
  }

  public String filename() { // gets filename without extension
    int dot = fullPath.lastIndexOf(extensionSeparator);
    int sep = fullPath.lastIndexOf(pathSeparator);
    return fullPath.substring(sep + 1, dot);
  }

  public String path() {
    int sep = fullPath.lastIndexOf(pathSeparator);
    return fullPath.substring(0, sep);
  }
}

Usage:

public class FilenameDemo {
  public static void main(String[] args) {
    final String FPATH = "/home/mem/index.html";
    Filename myHomePage = new Filename(FPATH, '/', '.');
    System.out.println("Extension = " + myHomePage.extension());
    System.out.println("Filename = " + myHomePage.filename());
    System.out.println("Path = " + myHomePage.path());
  }
}
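The class above assumes that the path contains both separators. A hedged variant that tolerates a missing extension or a missing directory could look like the sketch below; `SafeFilename` and `dotIndex` are hypothetical names, and treating a dot inside a directory name as "no extension" is a design assumption rather than something the source prescribes.

```java
class SafeFilename {
    private final String fullPath;
    private final char pathSeparator;
    private final char extensionSeparator;

    SafeFilename(String str, char sep, char ext) {
        fullPath = str;
        pathSeparator = sep;
        extensionSeparator = ext;
    }

    // Index of the extension separator inside the last path element,
    // or -1 when the last element has no extension.
    private int dotIndex() {
        int dot = fullPath.lastIndexOf(extensionSeparator);
        int sep = fullPath.lastIndexOf(pathSeparator);
        return dot > sep ? dot : -1;
    }

    String extension() {
        int dot = dotIndex();
        return dot == -1 ? "" : fullPath.substring(dot + 1);
    }

    // File name without directory and without extension.
    String basename() {
        int sep = fullPath.lastIndexOf(pathSeparator);
        int dot = dotIndex();
        return dot == -1 ? fullPath.substring(sep + 1) : fullPath.substring(sep + 1, dot);
    }

    String path() {
        int sep = fullPath.lastIndexOf(pathSeparator);
        return sep == -1 ? "" : fullPath.substring(0, sep);
    }

    public static void main(String[] args) {
        SafeFilename hosts = new SafeFilename("/etc/hosts", '/', '.');
        System.out.println("Extension = '" + hosts.extension() + "'"); // Extension = ''
        System.out.println("Basename = " + hosts.basename());          // Basename = hosts
        System.out.println("Path = " + hosts.path());                  // Path = /etc
    }
}
```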
Erhan Bagdemir
  • 4
    `basename()` would be a better name instead of `filename()` – nimcap Apr 10 '12 at 12:21
  • in case there's no extension (e.g. file name like "/etc/hosts") this will return "hosts" as the extension (rather than ""). library-grade utility classes should take care of corner cases. – Zach-M Nov 30 '16 at 06:52
2

File extensions are a broken concept

And there exists no reliable function for it. Consider for example this filename:

archive.tar.gz

What is the extension? DOS users would have preferred the name archive.tgz. Sometimes you see stupid Windows applications that first decompress the file (yielding a .tar file), then you have to open it again to see the archive contents.

In this case, a more reasonable notion of file extension would have been .tar.gz. There are also .tar.bz2, .tar.xz, .tar.lz and .tar.lzma file "extensions" in use. But how would you decide whether to split at the last dot or at the second-to-last dot?

Use mime-types instead.

The Java 7 function Files.probeContentType will likely detect file types much more reliably than trusting the file extension. Pretty much all of the Unix/Linux world, as well as your web browser and smartphone, already does it this way.
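A minimal sketch of calling `Files.probeContentType`, assuming Java 7 or later. The class and method names are hypothetical. The result depends on the file type detectors available on the platform, and the method may return null when the type cannot be determined, which the sketch maps to "unknown".

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

class ProbeTypeDemo {
    // Returns the probed MIME type, or "unknown" when no detector recognizes the file.
    static String contentTypeOf(Path file) throws IOException {
        String type = Files.probeContentType(file);
        return type == null ? "unknown" : type;
    }

    public static void main(String[] args) throws IOException {
        Path tmp = Files.createTempFile("probe-demo", ".txt");
        try {
            Files.write(tmp, "hello".getBytes(StandardCharsets.UTF_8));
            // The printed value is platform-dependent.
            System.out.println(contentTypeOf(tmp));
        } finally {
            Files.deleteIfExists(tmp);
        }
    }
}
```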

Has QUIT--Anony-Mousse
  • 12
    How does this answer the question? Neither `File` nor `Path` let me split off the extension. – Andreas Abel Nov 23 '17 at 16:19
  • 1
    @andreas.abel let me repeat this: File extensions are a broken concept. They are not reliable, nor well-defined except on DOS 8+3 file names (consider `.tar.gz` vs. `.tgz` all too common on unix). **Use mime types instead.** – Has QUIT--Anony-Mousse Apr 26 '18 at 15:12
  • 4
    @Anony-Mousse Well, I agree in principle but 99,999% of all systems I interact with use a file name, not a mime type – Christian Sauer Jul 26 '19 at 05:37
  • 1
    Where is the problem in using `Files.probeContentType` instead of relying on the file name to have the right extension? – Has QUIT--Anony-Mousse Jul 26 '19 at 06:08
  • 8
    This doesn't answer the question. I have a use-case where the file-name, a movie, is a name + extension. How would I extract the name by using mime-types? – Niek Aug 19 '19 at 18:12
  • Unfortunately Has QUIT--Anony-Mousse's comment does not help me. I came to SO to see whether java has something simple like ruby has via File.dirname() and File.basename(). To then yak shave about in regards to file extensions being bad and people should use mime types, does not help us. :( (I don't quite want to use regular expressions either though... guess I'll go with the library approach instead.) – shevy Mar 10 '21 at 19:31
  • The file extension is the one from the last dot, .gz in this case. So, they aren't broken, you just need to know what they are. The problem with stupid Windows applications is the Windows world, that's why I use *Nix systems and I use the pipe operator from the command line. You don't have any mime if the only thing that was passed to you is a file name. – zakmck Jun 03 '23 at 00:53
2

What's wrong with your code? Wrapped in a neat utility method it's fine.

What's more important is what to use as separator — the first or last dot. The first is bad for file names like "setup-2.5.1.exe", the last is bad for file names with multiple extensions like "mybundle.tar.gz".
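To make the trade-off concrete, here is a small sketch that wraps both choices in utility methods. The class and method names are hypothetical; the splitting logic is the same `indexOf`/`lastIndexOf` approach as in the question.

```java
import java.util.Arrays;

class DotChoiceDemo {
    // Treats everything after the first dot as the extension.
    static String[] splitAtFirstDot(String name) {
        return splitAt(name, name.indexOf('.'));
    }

    // Treats only the text after the last dot as the extension.
    static String[] splitAtLastDot(String name) {
        return splitAt(name, name.lastIndexOf('.'));
    }

    // Returns { base, extension }; the extension is "" when there is no dot.
    private static String[] splitAt(String name, int dot) {
        return dot == -1
                ? new String[] { name, "" }
                : new String[] { name.substring(0, dot), name.substring(dot + 1) };
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(splitAtFirstDot("setup-2.5.1.exe"))); // [setup-2, 5.1.exe]
        System.out.println(Arrays.toString(splitAtLastDot("setup-2.5.1.exe")));  // [setup-2.5.1, exe]
        System.out.println(Arrays.toString(splitAtFirstDot("mybundle.tar.gz"))); // [mybundle, tar.gz]
        System.out.println(Arrays.toString(splitAtLastDot("mybundle.tar.gz")));  // [mybundle.tar, gz]
    }
}
```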

Mot
-1

You can also use Java regular expressions; String.split() uses them internally. See http://download.oracle.com/javase/1.4.2/docs/api/java/util/regex/Pattern.html

Amit
-4

Maybe you could use String#split

To answer your comment:

I'm not sure whether there can be more than one . in a filename, but even if there are more dots you can still use split. Note that split takes a regular expression, so the dot must be escaped. For example:

String input = "boo.and.foo";

String[] result = input.split("\\.");

This will return an array containing:

{ "boo", "and", "foo" }

So you will know that the last index in the array is the extension and all others are the base.