Is there a standard (or at least good) way of converting between URLs and Windows filenames in Java?

I am trying to download files, but I want each Windows filename to be convertible back to the original URL. Note that the query portion of the URL is vital, as I will be downloading different pages that differ only in their query.

My current hacky solution is to replace illegal characters (such as '?') with a specific string (such as 'QQ'), but this makes conversion back to the URL less transparent. Is there a better way?

Paul

  • Some examples of what you're trying to accomplish would be very helpful. – Jim Garrison Oct 30 '09 at 21:49
  • I think he's talking about saving off the results of a web request by using the web URL as the filename. This runs into problems with characters such as '*' and '?', which are valid in a URL but invalid as part of a Windows filename. – James Van Huis Oct 30 '09 at 21:58
  • For example, www.google.com/search?q=bad+urls would not be a valid Windows filename (because of the question mark). – James Van Huis Oct 30 '09 at 22:00

4 Answers

You could do worse than use URLEncoder to encode the URL:

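// Needs java.io.File and java.net.URLEncoder; encode() declares UnsupportedEncodingException.
String url = "http://172.0.0.1:80/foo/bar/baz.txt?black=white";
// Percent-encodes ':', '/' and '?', which Windows does not allow in file names.
String filename = URLEncoder.encode(url, "UTF-8");
File file = new File(filename);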

The filename becomes a legal Win32 name:

http%3A%2F%2F172.0.0.1%3A80%2Ffoo%2Fbar%2Fbaz.txt%3Fblack%3Dwhite

This is a reversible operation:

String original = URLDecoder.decode(filename, "UTF-8");
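
One caveat worth checking: URLEncoder leaves '.', '-', '*' and '_' unescaped, and '*' is not allowed in a Windows filename. The following is a minimal sketch of one way to handle that, not a definitive implementation. The class and method names (UrlFileNames, toFileName, toUrl) are made up for illustration, and the sketch does not deal with other Windows restrictions such as reserved device names or name length.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;
import java.net.URLEncoder;

// Hypothetical helper: names are illustrative, not part of any library.
final class UrlFileNames {

    // Encode a URL so that it can be used as a single file name.
    static String toFileName(String url) {
        try {
            // URLEncoder does not escape '*', so it is percent-encoded by hand.
            return URLEncoder.encode(url, "UTF-8").replace("*", "%2A");
        } catch (UnsupportedEncodingException e) {
            // UTF-8 is always available on a Java platform.
            throw new IllegalStateException(e);
        }
    }

    // Reverse toFileName(); URLDecoder turns %2A back into '*'.
    static String toUrl(String fileName) {
        try {
            return URLDecoder.decode(fileName, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        String url = "http://www.google.com/search?q=bad+urls*";
        String name = toFileName(url);
        System.out.println(name);
        System.out.println(toUrl(name).equals(url));
    }
}
```

Because the encoder escapes both '%' and '+', decoding a name produced by toFileName should give back the exact input string, including any percent-escapes that were already in the URL.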
McDowell

The java.io.File class has constructors that take either a URI or a path name, and it provides toURI() and toURL() methods as well as getName() and getPath(). Would that be a valid conversion for you?
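
For reference, here is a small sketch of what those methods do. It assumes a made-up class name (FileUriDemo) and a made-up file name. Note that File.toURI() and new File(URI) deal in file: URIs for local paths, so an http URL cannot be handed to the File(URI) constructor directly.

```java
import java.io.File;
import java.net.URI;

// Hypothetical demo class: the name is illustrative only.
final class FileUriDemo {

    // Convert a local file to a file: URI and back again.
    static File roundTrip(File file) {
        URI uri = file.toURI();   // absolute file: URI, with spaces percent-encoded
        return new File(uri);     // accepts only absolute, hierarchical file: URIs
    }

    // Returns true if new File(URI) rejects the given URI.
    static boolean isRejected(URI uri) {
        try {
            new File(uri);
            return false;
        } catch (IllegalArgumentException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        File file = new File("my page.txt");
        System.out.println(file.toURI());
        System.out.println(roundTrip(file).equals(file.getAbsoluteFile()));
        System.out.println(isRejected(URI.create("http://www.google.com/search?q=bad+urls")));
    }
}
```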

Jé Queue

But is it possible to encode a URL as a filename at all? I mean, can there be a 100% valid solution? I think that converting a URL to a filename is the wrong idea in general, because URLs and filenames are subject to different limits:

Max filename length (NTFS filesystem, Unicode, using UTF-16 encoding): 255 characters

Max URL length (using UTF-8 encoding?): about 2000 characters
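
To make the mismatch concrete, here is a rough sketch that checks whether a URL still fits in one filename after percent-encoding. The class and method names (FileNameLength, fitsInFileName) are hypothetical, and the 255 limit is simply the figure quoted above. Keep in mind that every escaped character grows to three characters, so the encoded name can be up to three times as long as the URL.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

// Hypothetical helper: names are illustrative, not part of any library.
final class FileNameLength {

    // Limit quoted above for a single NTFS file name, in UTF-16 code units.
    static final int MAX_NAME_LENGTH = 255;

    static String encode(String url) {
        try {
            return URLEncoder.encode(url, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // UTF-8 is always available on a Java platform.
            throw new IllegalStateException(e);
        }
    }

    // The encoded name is pure ASCII, so length() equals the UTF-16 unit count.
    static boolean fitsInFileName(String url) {
        return encode(url).length() <= MAX_NAME_LENGTH;
    }

    public static void main(String[] args) {
        System.out.println(fitsInFileName("http://www.google.com/search?q=bad+urls"));
    }
}
```

A caller could use a check like this to decide when a different naming scheme is needed for very long URLs.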

Paulius

If you mean converting a URL-encoded string back to its unencoded form, you could use:

URLDecoder

Utility class for HTML form decoding. This class contains static methods for decoding a String from the application/x-www-form-urlencoded MIME format.

See if that's what you need.
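
A quick sketch of how that behaves, assuming a made-up wrapper class (FormDecoding) and a made-up input string. Because the format is application/x-www-form-urlencoded, a literal '+' in the input is decoded as a space, while %2B is decoded as '+'.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;

// Hypothetical wrapper: the name is illustrative only.
final class FormDecoding {

    // Decode a form-encoded string using UTF-8.
    static String decode(String encoded) {
        try {
            return URLDecoder.decode(encoded, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // UTF-8 is always available on a Java platform.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // %3D becomes '=', %2B becomes '+', and the raw '+' becomes a space.
        System.out.println(decode("q%3Dbad%2Burls+here"));
    }
}
```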

OscarRyz