What's a reliable way to get the extension of a file in Java?
There is no reliable way, because there is no reliable way of distinguishing a file suffix from a filename that has dot (period) characters in it.
Or to put it another way, the "real" extension is a construction placed on the filename by the human reader. And I think you will find that different people place different constructions on it. (The real extension for "foo.tar.gz" is either "gz" or "tar.gz", depending on your point of view ... and what the application is designed to do.)
The best you can do is to code your application to use either "stuff after first dot" or "stuff after last dot" as the suffix, depending on what it needs. (And maybe a bit of filtering to distinguish expected extensions from stuff that the application does not understand.)
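To make the two conventions concrete, here is a minimal sketch. The class and method names (FileSuffix, afterFirstDot, afterLastDot, knownSuffix) are my own inventions for illustration, not from any library. It assumes you pass a bare file name rather than a full path, and it treats a leading dot (as in ".bashrc") as part of the name rather than as a separator; your application may want different rules.

```java
import java.util.Locale;
import java.util.Optional;
import java.util.Set;

// Hypothetical helper: names and edge-case rules are illustrative assumptions.
final class FileSuffix {
    private FileSuffix() {}

    // "Stuff after last dot": "foo.tar.gz" -> "gz".
    // A dot at index 0 (".bashrc") is not treated as a separator.
    static String afterLastDot(String fileName) {
        int dot = fileName.lastIndexOf('.');
        return dot > 0 ? fileName.substring(dot + 1) : "";
    }

    // "Stuff after first dot": "foo.tar.gz" -> "tar.gz".
    // The search starts at index 1 so that a leading dot is skipped.
    static String afterFirstDot(String fileName) {
        int dot = fileName.indexOf('.', 1);
        return dot > 0 ? fileName.substring(dot + 1) : "";
    }

    // Filtering: try each candidate suffix, longest first, and return the
    // first one the application expects (compared in lower case).
    static Optional<String> knownSuffix(String fileName, Set<String> expected) {
        for (int dot = fileName.indexOf('.', 1); dot > 0;
                 dot = fileName.indexOf('.', dot + 1)) {
            String candidate = fileName.substring(dot + 1).toLowerCase(Locale.ROOT);
            if (expected.contains(candidate)) {
                return Optional.of(candidate);
            }
        }
        return Optional.empty();
    }
}
```

With an expected set of "tar.gz" and "zip", knownSuffix picks "tar.gz" out of "my.report.TAR.GZ" and returns an empty Optional for "notes.txt". If you start from a java.nio.file.Path, take the last path component first, so that dots in directory names are not mistaken for separators.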
Then there is the problem that the file extension (however you extract it) is not a reliable indicator of the file's format / meaning. You can attempt to determine the format by using something like Apache Tika. However, even that can be problematic, if the format is not recognized, or (worse) if there are multiple possible formats for a given file.
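To illustrate the difference between trusting the name and looking at the content, here is a deliberately crude sketch that checks for the two-byte gzip signature (0x1F 0x8B, per RFC 1952) at the start of a file. This is not Tika's API and is nowhere near as thorough as Tika's detection; the class and method names are hypothetical, and a positive result only means the file starts like a gzip stream.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper: a minimal content sniffer for one format only.
final class GzipSniffer {
    private GzipSniffer() {}

    // Returns true if the file begins with the gzip magic bytes 0x1F 0x8B.
    // The file name and extension are never consulted.
    static boolean looksLikeGzip(Path file) throws IOException {
        try (InputStream in = Files.newInputStream(file)) {
            int first = in.read();   // -1 if the file is empty
            int second = in.read();  // -1 if the file has fewer than two bytes
            return first == 0x1F && second == 0x8B;
        }
    }
}
```

A gzip-compressed file named "data.txt" passes this check, while a plain text file named "fake.gz" does not, which is exactly the mismatch that makes the extension unreliable.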
Returning to the foo.tar.gz example, as far as I am aware, the only program that relies on the file extension is the gunzip command, which will uncompress foo.tar.gz as foo.tar. The tar command itself is agnostic of the file extension:
- It will read any file as a TAR file, irrespective of the extension.
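- If the TAR file is compressed (using gzip compression), then you need to supply the -z or --gzip or equivalent option, irrespective of the extension. (Some implementations, such as GNU tar, can recognize the compression automatically when reading an archive, but they do so from signature bytes at the start of the file.)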
Most UNIX / Linux programs are similarly agnostic of file extensions.