I'm using the .NET Compact Framework with C# on Windows Mobile.

In my application I upload data to the server by serializing objects and sending them in an HttpWebRequest POST request. On the server the POST data is deserialized and saved to the database.

The other day I realised that I had a problem with special characters in the POST data (ampersands, etc.), so I introduced Uri.EscapeDataString() into the method and all was well.

However, today I discovered that there is a problem when the application attempts to upload a large amount of data (I'm not yet sure exactly what counts as "large").

Existing code (Kind of)

var uploadData = new List<Thing>();

uploadData.Add(new Thing() { Name = "Test 01" });
uploadData.Add(new Thing() { Name = "Test 02" });
uploadData.Add(new Thing() { Name = "Test with an & Ampersand " }); // Do this a lot!!

var postData = "uploadData=" + Uri.EscapeDataString(JsonConvert.SerializeObject(uploadData, new IsoDateTimeConverter()));

Problem

The call to Uri.EscapeDataString() is causing the following exception:

System.UriFormatException: Invalid URI: The Uri string is too long.

Question

Are there any other ways to prepare the data for upload?

As far as I can see HttpUtility (which has its own Encode/Decode methods) is not available for the compact framework.

ETFairfax
  • You could write your own implementation? `EscapeDataString()` seems to be mostly a convenience... do a normal `String.Replace` based on a library of characters that need to be escaped? – Smudge202 Jul 15 '11 at 08:54
  • MSDN states: UriFormatException - The length of stringToEscape exceeds 32766 characters. – fluent Jul 22 '11 at 07:37
  • As Smudge202 suggested, I simply wrote my own implementation. – ETFairfax Aug 25 '11 at 11:59
  • How about posting this implementation? – Oleg Grishko Jan 30 '12 at 13:06
  • I would have posted the implementation but it was a bit scabby!! I've recently changed to use the accepted answer. – ETFairfax Mar 07 '14 at 12:45
  • Since .NET Framework 4.5 and .NET Standard 1.0 you should use WebUtility.UrlEncode. See [this answer](https://stackoverflow.com/a/16894322/645511) for why. – Katie Kilian Jul 25 '19 at 14:49
  • @CharlieKilian - This was asked a long, long, long time ago, but the main problem was because I was having to use the Compact Framework. If memory serves me well WebUtility wouldn't be available on CF. – ETFairfax Jul 25 '19 at 21:07
  • @ETFairfax That's fair. I wasn't leaving this comment to tell you you'd been wrong back then. I was leaving it because this had confused me as I was doing my own research, and once I'd found a better answer these eight years later, I thought I'd help out anyone who comes across it so they could get to the current answer faster. Definitely not a criticism of this question or its answers! In fact, you'd already got my +1. – Katie Kilian Jul 26 '19 at 14:44

6 Answers

38

Or you could simply split your string and call Uri.EscapeDataString(string) for each block, in order to avoid reimplementing the function.

Sample Code:

        String value = "large string to encode";
        int limit = 2000;

        StringBuilder sb = new StringBuilder();
        int loops = value.Length / limit;

        for (int i = 0; i <= loops; i++)
        {
            if (i < loops)
            {
                sb.Append(Uri.EscapeDataString(value.Substring(limit * i, limit)));
            }
            else
            {
                sb.Append(Uri.EscapeDataString(value.Substring(limit * i)));
            }
        }
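
One caveat with fixed-size blocks: a cut can land between the two halves of a UTF-16 surrogate pair (an emoji, for example). Each half is then escaped on its own, and depending on the runtime a lone surrogate may be rejected or encoded as a replacement character. The following is a minimal sketch of the same splitting idea that moves the cut back by one char in that case. The names `ChunkedEscaper` and `EscapeLongString` are illustrative and not part of the answer above, and the block size remains the caller's choice.

```csharp
using System;
using System.Text;

public static class ChunkedEscaper
{
    // Hypothetical helper: escapes 'value' in blocks of at most 'limit' chars,
    // never cutting between the two halves of a surrogate pair.
    public static string EscapeLongString(string value, int limit)
    {
        if (value == null) throw new ArgumentNullException("value");
        if (limit < 2) throw new ArgumentOutOfRangeException("limit");

        StringBuilder sb = new StringBuilder(value.Length);
        int pos = 0;
        while (pos < value.Length)
        {
            int len = Math.Min(limit, value.Length - pos);

            // If more text follows and the block ends on a high surrogate
            // (U+D800..U+DBFF), shorten it by one so the pair stays together.
            char last = value[pos + len - 1];
            if (pos + len < value.Length && last >= '\uD800' && last <= '\uDBFF')
            {
                len--;
            }

            sb.Append(Uri.EscapeDataString(value.Substring(pos, len)));
            pos += len;
        }
        return sb.ToString();
    }
}
```

Usage would look like `var postData = "uploadData=" + ChunkedEscaper.EscapeLongString(json, 2000);`, where `json` stands for the serialized payload.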
Alberto de Paola
  • The limit in .NET 4.5 for EscapeDataString is 65520 characters, so that could be used to reduce the number of iterations needed. – Knaģis Dec 12 '12 at 13:54
  • Cool. Is there this kind of problem with Uri.Unescape? It seems not, but I wonder just in case. – Valentin Kuzub Mar 05 '14 at 05:14
  • @Knagis I am not sure why you mention the number of iterations here, since that can hardly be a big part of the execution time. Initializing the StringBuilder with a capacity of value.Length definitely sounds like a better performance boost. – Valentin Kuzub Mar 05 '14 at 05:32
  • Just an update: the documented limit of EscapeDataString in .NET 4.5 is 32766 characters (not 65520 as mentioned by @Knagis above): https://msdn.microsoft.com/en-us/library/system.uri.escapedatastring%28v=vs.110%29.aspx – Nick May 01 '15 at 06:34
  • @Nick if you actually try it, you may find that 65520 is the actual limit (exclusive, so 65519 is the most that will work) despite what the documentation says. – Jon Hanna Aug 31 '15 at 11:11
5

Alberto de Paola's answer is good.

Nonetheless, unescaping the escaped data is a little trickier, because you have to avoid cutting the encoded string in the middle of an encoded char (or you will break the integrity of the original string).

Here's my way of fixing this issue:

public static string EncodeString(string str)
{
    //maxLengthAllowed .NET < 4.5 = 32765;
    //maxLengthAllowed .NET >= 4.5 = 65519;
    int maxLengthAllowed = 65519;
    StringBuilder sb = new StringBuilder();
    int loops = str.Length / maxLengthAllowed;

    for (int i = 0; i <= loops; i++)
    {
        sb.Append(Uri.EscapeDataString(i < loops
            ? str.Substring(maxLengthAllowed * i, maxLengthAllowed)
            : str.Substring(maxLengthAllowed * i)));
    }

    return sb.ToString();
}

public static string DecodeString(string encodedString)
{
    //maxLengthAllowed .NET < 4.5 = 32765;
    //maxLengthAllowed .NET >= 4.5 = 65519;
    int maxLengthAllowed = 65519;

    int charsProcessed = 0;
    StringBuilder sb = new StringBuilder();

    while (encodedString.Length > charsProcessed)
    {
        var stringToUnescape = encodedString.Substring(charsProcessed).Length > maxLengthAllowed
            ? encodedString.Substring(charsProcessed, maxLengthAllowed)
            : encodedString.Substring(charsProcessed);

        // If the cut landed inside an encoded sequence (%xx), cut before the '%' so the whole encoded char is decoded together in the next pass
        var incorrectStrPos = stringToUnescape.Length == maxLengthAllowed ? stringToUnescape.IndexOf("%", stringToUnescape.Length - 4, StringComparison.InvariantCulture) : -1;
        if (incorrectStrPos > -1)
        {
            stringToUnescape = encodedString.Substring(charsProcessed).Length > incorrectStrPos
                ? encodedString.Substring(charsProcessed, incorrectStrPos)
                : encodedString.Substring(charsProcessed);
        }

        sb.Append(Uri.UnescapeDataString(stringToUnescape));
        charsProcessed += stringToUnescape.Length;
    }

    var decodedString = sb.ToString();

    // ensure the string is sanitized here or throw an exception if XSS / SQL injection is found
    // (SQLHelper is a project-specific helper that is not shown in this answer)
    SQLHelper.SecureString(decodedString);
    return decodedString;
}

To test these functions:

var testString = "long string to encode";
var encodedString = EncodeString(testString);
var decodedString = DecodeString(encodedString);

Console.WriteLine(decodedString == testString ? "integrity respected" : "integrity broken");

Hope this helps avoid some headaches ;)
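
The boundary rule described above (never cut inside an encoded char) also matters for characters that Uri.EscapeDataString emits as several %XX escapes in a row, such as Cyrillic letters encoded as two UTF-8 bytes. As a rough sketch, and assuming the input really is UTF-8 percent-encoding, the cut can be moved back until it sits in front of a lead byte instead of a continuation byte (0x80 to 0xBF). `ChunkedUnescaper`, `UnescapeLongString`, and the `limit` parameter are hypothetical names introduced for this illustration.

```csharp
using System;
using System.Text;

public static class ChunkedUnescaper
{
    // Hypothetical helper: unescapes 'encoded' in blocks of at most 'limit' chars.
    // 'limit' must be at least 12, the escaped length of a 4-byte UTF-8 character.
    public static string UnescapeLongString(string encoded, int limit)
    {
        if (encoded == null) throw new ArgumentNullException("encoded");
        if (limit < 12) throw new ArgumentOutOfRangeException("limit");

        StringBuilder sb = new StringBuilder(encoded.Length);
        int pos = 0;
        while (pos < encoded.Length)
        {
            int proposed = Math.Min(pos + limit, encoded.Length);
            int cut = SafeCut(encoded, pos, proposed);
            if (cut <= pos) cut = proposed; // malformed input: fall back to the raw cut
            sb.Append(Uri.UnescapeDataString(encoded.Substring(pos, cut - pos)));
            pos = cut;
        }
        return sb.ToString();
    }

    // Moves the cut so it splits neither a %XX triplet nor a multi-byte sequence.
    private static int SafeCut(string s, int start, int proposed)
    {
        if (proposed >= s.Length) return s.Length;
        int cut = proposed;

        // A '%' one or two chars before the cut means the cut splits a triplet.
        if (cut - 1 >= start && s[cut - 1] == '%') cut -= 1;
        else if (cut - 2 >= start && s[cut - 2] == '%') cut -= 2;

        // If the triplet after the cut is a continuation byte, step back to the lead byte.
        while (IsContinuationEscape(s, cut) && cut - 3 >= start && s[cut - 3] == '%')
        {
            cut -= 3;
        }
        return cut;
    }

    private static bool IsContinuationEscape(string s, int i)
    {
        if (i + 2 >= s.Length || s[i] != '%') return false;
        int hi = HexValue(s[i + 1]);
        int lo = HexValue(s[i + 2]);
        if (hi < 0 || lo < 0) return false;
        int b = hi * 16 + lo;
        return b >= 0x80 && b <= 0xBF;
    }

    private static int HexValue(char c)
    {
        if (c >= '0' && c <= '9') return c - '0';
        if (c >= 'a' && c <= 'f') return c - 'a' + 10;
        if (c >= 'A' && c <= 'F') return c - 'A' + 10;
        return -1;
    }
}
```

For valid input the loop steps back at most three triplets, so each pass still makes progress.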

Pouki
  • This builds a better overall solution. I was getting bit by the split in the middle of a character to be translated. – user3841460 Nov 20 '18 at 23:17
2
// Percent-encode everything except ASCII letters and digits.
// Non-ASCII characters are encoded as their UTF-8 bytes, one %XX per byte
// (formatting the char code directly would emit invalid escapes such as %410).
// Requires: using System.Text;
StringBuilder stringBuilder = new StringBuilder();
foreach (byte b in Encoding.UTF8.GetBytes(originalString))
{
    if ((b >= 'a' && b <= 'z') ||
        (b >= 'A' && b <= 'Z') ||
        (b >= '0' && b <= '9'))
    {
        stringBuilder.Append((char)b);
    }
    else
    {
        stringBuilder.AppendFormat("%{0:X2}", b);
    }
}

string result = stringBuilder.ToString();
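
For the receiving side, a hand-rolled decoder can follow the same pattern and has no length limit of its own. This is only a sketch under the assumption that the escapes represent UTF-8 bytes (which is what Uri.EscapeDataString produces); `PercentDecoder` and `Decode` are illustrative names, not an existing API.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

public static class PercentDecoder
{
    // Hypothetical helper: turns each run of %XX escapes into bytes and
    // interprets the run as UTF-8; every other character is copied as-is.
    public static string Decode(string encoded)
    {
        if (encoded == null) throw new ArgumentNullException("encoded");

        StringBuilder sb = new StringBuilder(encoded.Length);
        List<byte> pending = new List<byte>();
        int i = 0;
        while (i < encoded.Length)
        {
            int hi = -1, lo = -1;
            if (encoded[i] == '%' && i + 2 < encoded.Length)
            {
                hi = HexValue(encoded[i + 1]);
                lo = HexValue(encoded[i + 2]);
            }

            if (hi >= 0 && lo >= 0)
            {
                // Valid escape: collect the byte and keep reading the run.
                pending.Add((byte)(hi * 16 + lo));
                i += 3;
            }
            else
            {
                // Literal character (or a stray '%'): flush the run, then copy it.
                Flush(pending, sb);
                sb.Append(encoded[i]);
                i++;
            }
        }
        Flush(pending, sb);
        return sb.ToString();
    }

    private static void Flush(List<byte> pending, StringBuilder sb)
    {
        if (pending.Count == 0) return;
        byte[] raw = pending.ToArray();
        sb.Append(Encoding.UTF8.GetString(raw, 0, raw.Length));
        pending.Clear();
    }

    private static int HexValue(char c)
    {
        if (c >= '0' && c <= '9') return c - '0';
        if (c >= 'a' && c <= 'f') return c - 'a' + 10;
        if (c >= 'A' && c <= 'F') return c - 'A' + 10;
        return -1;
    }
}
```

Note that this sketch leaves '+' untouched; form decoders that treat '+' as a space would need an extra step.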
Doug
1

I have been using System.Web.HttpUtility.UrlEncode and it seems to handle longer strings much better.

themullet
0

Use System.Web.HttpUtility.UrlEncode (based on this answer):

        value = HttpUtility.UrlEncode(value)
            .Replace("!", "%21")
            .Replace("(", "%28")
            .Replace(")", "%29")
            .Replace("*", "%2A")
            .Replace("%7E", "~"); // undo escape
Jeroen K
0

I needed another solution because Pouki's solution does not work when Cyrillic text is processed and the cut falls in the middle of an encoded symbol.

The alternative solution is as follows:

    protected const int MaxLengthAllowed = 32765;
    private static string UnescapeString(string encodedString)
    {
        var charsProcessed = 0;

        var sb = new StringBuilder();

        while (encodedString.Length > charsProcessed)
        {
            var isLastIteration = encodedString.Substring(charsProcessed).Length < MaxLengthAllowed;

            var stringToUnescape = isLastIteration
                ? encodedString.Substring(charsProcessed)
                : encodedString.Substring(charsProcessed, MaxLengthAllowed);

            // Shorten the chunk one char at a time until it is well formed.
            while (stringToUnescape.Length > 0 && !Uri.IsWellFormedUriString(stringToUnescape, UriKind.RelativeOrAbsolute))
            {
                stringToUnescape = stringToUnescape.Substring(0, stringToUnescape.Length - 1);
            }

            // Guard against an endless loop when no well-formed chunk can be found.
            if (stringToUnescape.Length == 0)
            {
                throw new FormatException("No well-formed chunk found at position " + charsProcessed + ".");
            }

            sb.Append(Uri.UnescapeDataString(stringToUnescape));
            charsProcessed += stringToUnescape.Length;
        }

        return sb.ToString();
    }
BIGDOGICO