2

I am having trouble encoding a URL that contains both non-ASCII characters and spaces, for example http://xxx.xx.xx.xx/resources/upload/pdf/APPLE ははは.pdf. I've read here that only the last segment of the URL's path needs to be encoded.

Here's the code:

public static String getLastPathFromUrl(String url) {
    return url.replaceFirst(".*/([^/?]+).*", "$1");
}

So now I have APPLE ははは.pdf. The next step is to replace the spaces with %20 so the link works, BUT the problem is that if I encode APPLE%20ははは.pdf, it becomes APPLE%2520%E3%81%AF%E3%81%AF%E3%81%AF.pdf, because the % of %20 is itself encoded as %25. What I should get is APPLE%20%E3%81%AF%E3%81%AF%E3%81%AF.pdf.

So I decided to:

1. Separate each word from the link
2. Encode it
3. Concatenate the new encoded words, for example:
    3.A. APPLE (APPLE)
    3.B. %E3%81%AF%E3%81%AF%E3%81%AF.pdf (ははは.pdf)
    with the (space) converted to %20, now becomes APPLE%20%E3%81%AF%E3%81%AF%E3%81%AF.pdf

Here's my code:

public static String[] splitWords(String sentence) {
    String[] words = sentence.split(" ");
    return words;
}

The calling code:

String urlLastPath = getLastPathFromUrl(pdfUrl);
String[] splitWords = splitWords(urlLastPath);
for (String word : splitWords) {
    String urlEncoded = URLEncoder.encode(word, "utf-8"); // STUCK HERE
}

I now want to concatenate the encoded strings (urlEncoded) produced for each element of the array, joined with %20, to finally form APPLE%20%E3%81%AF%E3%81%AF%E3%81%AF.pdf. How do I do this?

Compaq LE2202x
  • I don't know if I understood the question well. At the end do you want to just replace the last path from the URL with your encoded string? – nikmin Feb 04 '14 at 09:16
  • @nikmin You're right, and after that I want to concatenate the encoded words to have my desired URL. – Compaq LE2202x Feb 04 '14 at 09:21

4 Answers

1

The % in your %20 is itself being encoded, which is where %2520 comes from. Don't replace the spaces first: just call URLEncoder.encode(word, "utf-8") on the original string, which gives APPLE+%E3%81%AF%E3%81%AF%E3%81%AF.pdf, and then replace + with %20 in the final result.
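
As a minimal sketch of that encode-then-replace order (not the asker's final code): the class and method names below are made up for illustration, and the sketch assumes the URL has no query string or fragment after the file name.

import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

public class PathSegmentEncoder {

    // Hypothetical helper: percent-encodes one path segment as UTF-8.
    static String encodeSegment(String segment) throws UnsupportedEncodingException {
        // URLEncoder targets form data, so it writes a space as "+".
        // A literal "+" in the input has already become "%2B" at this point,
        // so every remaining "+" stands for a space and can be swapped for "%20".
        return URLEncoder.encode(segment, "UTF-8").replace("+", "%20");
    }

    // Hypothetical helper: encodes only the text after the last "/" of a raw URL.
    // Assumes there is no "?query" or "#fragment" after the file name.
    static String encodeLastSegment(String rawUrl) throws UnsupportedEncodingException {
        int cut = rawUrl.lastIndexOf('/') + 1;
        return rawUrl.substring(0, cut) + encodeSegment(rawUrl.substring(cut));
    }

    public static void main(String[] args) throws UnsupportedEncodingException {
        // \u306F is the character は, written as an escape so the source file
        // encoding does not matter.
        String raw = "http://xxx.xx.xx.xx/resources/upload/pdf/APPLE \u306F\u306F\u306F.pdf";
        System.out.println(encodeLastSegment(raw));
    }
}

The input must still be unencoded when it reaches encodeSegment; passing in a string that already contains %20 reproduces the %2520 problem from the question.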

Maulik.J
  • I've never felt so careless. Thank you very much, you saved my ass. I had already separated the last path of my URL; I just needed to encode it as a whole and replace the `+` with `%20`. – Compaq LE2202x Feb 05 '14 at 02:22
1

Do you want to do something like this:

// Get the whole URL as a string
String urlString = pdfUrl.toString();

// Get everything up to and including the last "/"
String result = urlString.substring(0, urlString.lastIndexOf("/") + 1);

String urlLastPath = getLastPathFromUrl(pdfUrl);
String[] splitWords = splitWords(urlLastPath);

for (int i = 0; i < splitWords.length; i++) {
    // put %20 back where split(" ") removed a space
    if (i > 0) {
        result += "%20";
    }

    // add the encoded word to the URL
    result += URLEncoder.encode(splitWords[i], "utf-8");
}

Now the string result is your encoded URL as a string.

nikmin
1

Possibly easy with org.apache.commons.io.FilenameUtils.

  1. Split your url into baseUrl and the file name and extension.
  2. Encode the file name and extension
  3. Join them together

String url = "http://xxx.xx.xx.xx/resources/upload/pdf/APPLE ははは.pdf";

String baseUrl = FilenameUtils.getPath(url); // GIVES: http://xxx.xx.xx.xx/resources/upload/pdf/
String myFile = FilenameUtils.getBaseName(url)
            + "." + FilenameUtils.getExtension(url); // GIVES: APPLE ははは.pdf
String encoded = URLEncoder.encode(myFile, "UTF-8") // GIVES: APPLE+%E3%81%AF%E3%81%AF%E3%81%AF.pdf
        .replace("+", "%20"); // in a URL path "+" is a literal plus, so write the space as %20
System.out.println(baseUrl + encoded);

Output:

http://xxx.xx.xx.xx/resources/upload/pdf/APPLE%20%E3%81%AF%E3%81%AF%E3%81%AF.pdf
StoopidDonut
  • I already did this in a different way, but thanks though. My mistake was I replaced first all `(whitespace)` to `%20` then encode, when I just had to encode first then replace `+` to `%20`. – Compaq LE2202x Feb 05 '14 at 02:26
  • What if my `baseUrl` is only `http://xxx.xx.xx.xx/` and `fileUrl` is `resources/upload/pdf/APPLE ははは.pdf`, I can now get the last path which is `APPLE ははは.pdf` but how do I get the remaining path which is `resources/upload/pdf/`? – Compaq LE2202x Feb 10 '14 at 10:58
  • @CompaqLE2202x Do I get a bounty? Just kidding, you can use `URL` class for it with minor tweaks: `String str = "http://xxx.xx.xx.xx/resources/upload/pdf/APPLE.pdf"; URL url = new URL(str); System.out.println(url.getProtocol() + "://" + url.getHost()); System.out.println(url.getPath());` – StoopidDonut Feb 11 '14 at 07:56
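
Building on the `URL` idea in the comment above, here is a hedged sketch of a different technique from the encode-then-replace one: let `URL` split the raw string into components, then hand those components to the multi-argument `java.net.URI` constructor, which quotes illegal characters such as spaces, and call `toASCIIString()` to escape the non-ASCII ones. The class and method names are made up, and the sketch assumes `baseUrl` ends with `/` and `fileUrl` is relative and not yet encoded.

import java.net.MalformedURLException;
import java.net.URI;
import java.net.URISyntaxException;
import java.net.URL;

public class UriQuoting {

    // Hypothetical helper: joins baseUrl and an unencoded relative fileUrl
    // and returns a fully escaped, ASCII-only URL string.
    static String toAsciiUrl(String baseUrl, String fileUrl)
            throws MalformedURLException, URISyntaxException {
        // URL does not validate the path, so spaces and Japanese text are accepted.
        // (Recent JDKs flag this constructor as deprecated; it still works.)
        URL url = new URL(baseUrl + fileUrl);

        // The multi-argument constructor quotes characters that are illegal
        // in each component, for example " " becomes "%20" in the path.
        URI uri = new URI(url.getProtocol(), url.getUserInfo(), url.getHost(),
                url.getPort(), url.getPath(), url.getQuery(), url.getRef());

        // toASCIIString() percent-encodes the remaining non-ASCII characters as UTF-8.
        return uri.toASCIIString();
    }

    public static void main(String[] args) throws Exception {
        // \u306F is the character は.
        System.out.println(toAsciiUrl("http://xxx.xx.xx.xx/",
                "resources/upload/pdf/APPLE \u306F\u306F\u306F.pdf"));
    }
}

These constructors always quote `%`, so this approach also expects unencoded input; a path that already contains `%20` would come out as `%2520`.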
0

Don't reinvent the wheel. Use URLEncoder for encoding the URL.

URLEncoder.encode(yourArgumentsHere, "utf-8");

Moreover, where do you get your URL from, so that you have to split it before encoding? You should first build the arguments (last part), then just append it onto the base URL.

FD_
  • According to this [link](http://stackoverflow.com/a/3286128/1968739), I can only encode the last part of the URL, since using `URLEncoder.encode(yourArgumentsHere, "utf-8");` on the whole URL would transform `http://` to `http%3A%2F%2F`. My resource URL comes from my server and is based on the file name, so if I have `APPLE ははは.pdf` the resource URL would be `http://xxx.xx.xx.xx/resources/upload/pdf/APPLE ははは.pdf`, which has spaces and Japanese characters. – Compaq LE2202x Feb 04 '14 at 09:19
  • Modify your server to only return the file name, then URLEncode it, then build the full URL. – FD_ Feb 04 '14 at 09:23
  • That's the problem, that was already the design. I could only adapt to what's provided for me. – Compaq LE2202x Feb 05 '14 at 02:27