66

I have a string representing a URL that contains spaces, and I want to convert it to a URI object. If I simply try to create it via

String myString = "http://myhost.com/media/File Name that has spaces inside.mp3";
URI myUri = new URI(myString);

it gives me

java.net.URISyntaxException: Illegal character in path at index X

where index X is the position of the first space in the URL string.

How can I parse myString into a URI object?

whlk

5 Answers

130

You should in fact URI-encode the "invalid" characters. Because the string contains the complete URL, it is hard to URI-encode it properly: you cannot tell from the raw String which slashes / are separators that must be kept and which are not. The problem really needs to be solved at a higher level. Where does that String come from? Is it hardcoded? Then just fix it yourself. Does it come in as user input? Then validate it, show an error, and let the user correct it.

In any case, if you can ensure that spaces are the only thing making the URL invalid, you can simply replace each space with %20:

URI uri = new URI(string.replace(" ", "%20"));

Or, if you can ensure that only the part after the last slash needs to be URI-encoded, you can do that with the help of the android.net.Uri utility class:

int pos = string.lastIndexOf('/') + 1;
URI uri = new URI(string.substring(0, pos) + Uri.encode(string.substring(pos)));

Do note that URLEncoder is unsuitable for this task, as it is designed to encode query string parameter names/values as per application/x-www-form-urlencoded rules (as used in HTML forms). See also Java URL encoding of query string parameters.
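One alternative that this answer does not mention, and that works in plain Java when the scheme, host and path are available separately, is the multi-argument java.net.URI constructor, which percent-encodes illegal characters such as spaces in the path. The following is only a minimal sketch; the class name SpaceSafeUri and the helper fromParts are made up for illustration.

```java
import java.net.URI;
import java.net.URISyntaxException;

class SpaceSafeUri {
    // Hypothetical helper: builds a URI from separate components.
    // The multi-argument constructor quotes characters that are illegal
    // in the path (such as spaces) instead of rejecting them.
    static URI fromParts(String scheme, String host, String path) {
        try {
            return new URI(scheme, host, path, null); // null = no fragment
        } catch (URISyntaxException e) {
            throw new IllegalArgumentException(e);
        }
    }

    public static void main(String[] args) {
        URI uri = fromParts("http", "myhost.com",
                "/media/File Name that has spaces inside.mp3");
        // prints http://myhost.com/media/File%20Name%20that%20has%20spaces%20inside.mp3
        System.out.println(uri);
    }
}
```

Be aware that these constructors always quote the percent character too, so a path that is already percent-encoded would be encoded a second time. This approach only fits raw, unencoded components.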

BalusC
  • @Mannaz - just be careful when another "invalid" symbol appears in a song name. – Bozho Apr 07 '10 at 19:31
  • @BalusC I tried URLEncoder.encode("query string", "UTF-8"); it returns a + symbol, as in "query+string", where I was expecting "%20". So I used string.replace with hardcoded values, which solved the issue. Thanks for the info. Is there any other way to encode instead of a manual replace? – praveenb Apr 05 '12 at 11:20
18
java.net.URLEncoder.encode(finalPartOfString, "utf-8");

This will URL-encode the string.

finalPartOfString is the part after the last slash - in your case, the name of the song, as it seems.
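As a rough sketch of how this could be applied to the URL from the question: URLEncoder follows form-encoding rules and turns each space into +, so the example below additionally replaces + with %20 afterwards. That replacement is an assumption added here rather than part of the answer, and the names LastSegmentEncoder and encodeLastSegment are hypothetical.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

class LastSegmentEncoder {
    // Hypothetical helper: encodes only the text after the last '/'.
    // URLEncoder emits '+' for a space and %2B for a literal plus sign,
    // so every '+' in its output stands for a space and can become %20.
    static String encodeLastSegment(String url) {
        int pos = url.lastIndexOf('/') + 1;
        try {
            String encoded = URLEncoder.encode(url.substring(pos), "UTF-8")
                    .replace("+", "%20");
            return url.substring(0, pos) + encoded;
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException("UTF-8 should always be available", e);
        }
    }

    public static void main(String[] args) {
        String myString =
                "http://myhost.com/media/File Name that has spaces inside.mp3";
        // prints http://myhost.com/media/File%20Name%20that%20has%20spaces%20inside.mp3
        System.out.println(encodeLastSegment(myString));
    }
}
```

The result can then be passed to new URI(...). This only helps when everything before the last slash is already valid.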

Bozho
  • It will also URL-encode the colon and the slashes, which would still leave the URL invalid. He basically only needs to URL-encode the spaces to make it valid. – BalusC Apr 07 '10 at 14:26
  • Ok, this gets me by the `URISyntaxException` but now i get a 404 from the server. The url I get is `http://myhost.com/media/mp3s/9/Agenda+of+swine+-+13.+Persecution+Ascension_+leave+nothing+standing.mp3`. I use the URI in an `org.apache.http.client.methods.HttpGet.HttpGet` Request. Any ideas? – whlk Apr 07 '10 at 14:44
  • @Mannaz now that's another thing - you have to show the servlet code - or better, ask another question. The problem is no longer on the client. – Bozho Apr 07 '10 at 14:47
  • @Bozho sure it is a client/encoding problem, because requesting the original URL (`myString`) in a normal browser does not result in a 404 error. – whlk Apr 07 '10 at 14:59
  • @Mannaz and does the resultant (encoded) string result in 404 in a browser? – Bozho Apr 07 '10 at 15:08
  • I am using java.net.URLEncoder.encode("aa bb cc", "utf-8"); but instead of replacing each space with %20 it replaces it with +, giving "aa+bb+cc". Why is this happening? – Sniper Oct 22 '13 at 14:54
  • @Sniper, I've got the same problem ('+' instead of '%20') – yuralife Feb 04 '14 at 14:55
  • Found any solution for the plus sign? – Hanry Mar 09 '16 at 18:51
1

To handle spaces, @, and other unsafe characters in arbitrary locations in the URL path, use Uri.Builder in combination with a local instance of URL:

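// Requires android.net.Uri, java.net.URL and java.net.MalformedURLException.
public Uri getUriFromUrl(String thisUrl) throws MalformedURLException {
    URL url = new URL(thisUrl);
    return new Uri.Builder()
            .scheme(url.getProtocol())
            .authority(url.getAuthority())
            // path() percent-encodes the path but keeps the '/' separators;
            // appendPath() treats its argument as one segment and encodes '/' as %2F.
            .path(url.getPath())
            .build();
}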
Phileo99
1
// Read the demo file from the same location as the class.
URL url = Test.class.getResource(args[0]);
File input=null;
try {
    input = new File(url.toURI());
} catch (URISyntaxException e1) {
    // TODO Auto-generated catch block
    e1.printStackTrace();
}
azurefrog
siddmuk2005
  • because this isn't answering the question. – MetaFight Sep 02 '14 at 15:56
  • I posted this as a way to get rid of the space problem in the URL. It solved my problem: when reading the file location, FileInputStream pointed to null and reading from it threw an exception, but by using URI I didn't get that problem. – siddmuk2005 Sep 04 '14 at 06:07
0

I wrote this function:

public static String encode(@NonNull String uriString) {
    if (TextUtils.isEmpty(uriString)) {
        Assert.fail("Uri string cannot be empty!");
        return uriString;
    }
    // getQueryParameterNames() does not exist before API 11, so the query cannot be iterated
    if (Build.VERSION.SDK_INT < 11) {
        return uriString;
    }

    // Check if uri has valid characters
    // See https://tools.ietf.org/html/rfc3986
    Pattern allowedUrlCharacters = Pattern.compile("([A-Za-z0-9_.~:/?\\#\\[\\]@!$&'()*+,;" +
            "=-]|%[0-9a-fA-F]{2})+");
    Matcher matcher = allowedUrlCharacters.matcher(uriString);
    String validUri = null;
    if (matcher.find()) {
        validUri = matcher.group();
    }
    if (TextUtils.isEmpty(validUri) || uriString.length() == validUri.length()) {
        return uriString;
    }

    // The uriString is not fully encoded, so rebuild the URI and let the builder encode it
    Uri uri = Uri.parse(uriString);
    Uri.Builder uriBuilder = new Uri.Builder()
            .scheme(uri.getScheme())
            .authority(uri.getAuthority());
    for (String path : uri.getPathSegments()) {
        uriBuilder.appendPath(path);
    }
    for (String key : uri.getQueryParameterNames()) {
        uriBuilder.appendQueryParameter(key, uri.getQueryParameter(key));
    }
    String correctUrl = uriBuilder.build().toString();
    return correctUrl;
}
hadilq