0

I'm reviewing some existing code (converted from C#). It currently includes this expression:

Value.Substring(Value.IndexOf(symbol), 1)

where symbol is a Char from Value, a String. (Yes, it is from For Each symbol As Char In Value.)

As far as I can tell from MSDN this will be identical to:

symbol.ToString

(which will be much faster).

Can anyone suggest any situation when this won't be the case?

Note I do see the "flaw" in this code that means it may refer to an earlier version of symbol in Value than the one we're really considering, but that is part of my question: can this ever actually matter?

(Obviously another solution is to rework the code to For i = 0 To Value.Length - 1: symbol = Value(i): ... Value.Substring(i, 1) ... to keep the reference to the original string, just in case.)

    ''' <summary>
    ''' This is a different Url Encode implementation since the default .NET one outputs the percent encoding in lower case.
    ''' While this is not a problem with the percent encoding spec, it is used in upper case throughout OAuth
    ''' </summary>
    ''' <param name="Value">The value to Url encode</param>
    ''' <returns>Returns a Url encoded string</returns>
    Public Function UrlEncode(ByVal Value As String) As String
        Dim result As New StringBuilder()

        For Each symbol As Char In Value
            If unreservedChars.IndexOf(symbol) <> -1 Then
                result.Append(symbol)
            Else
                'some symbols produce > 2 char values so the system urlencoder must be used to get the correct data
                Dim hexAscW As String = String.Format("{0:X2}", CInt(AscW(symbol)))
                If hexAscW.Length > 3 Then
                    result.Append(HttpUtility.UrlEncode(Value.Substring(Value.IndexOf(symbol), 1)).ToUpper())
                Else
                    result.Append("%"c & hexAscW)
                End If
            End If
        Next
        Return result.ToString()
    End Function

I've already refactored the hexAscW which was originally a duplicated expression.

Mark Hurd
  • 10,665
  • 10
  • 68
  • 101
  • What are you actually trying to achieve? It sounds like you clearly have some bigger picture here, but it's not really clear what it is. – Jon Skeet Mar 11 '14 at 10:13
  • I'm afraid the question is not clear. Can you provide a sample string which demonstrates your concerns? – Tim Schmelter Mar 11 '14 at 10:13
  • Hmmm... Googling for `"Value.Substring(Value.IndexOf(symbol), 1)"` actually finds quite a few exact matches ... – Mark Hurd Mar 11 '14 at 10:13
  • @JonSkeet This is from an OAuth implementation I've just received. The comment for the `UrlEncode` implementation says it can't use `HttpUtility.UrlEncode` in general because it needs the percent encoding in upper case, so it only uses it when the `AscW` value is > `255` (although it determines it based upon the length of the hex value). It uses this expression to "locate" the character in the string to pass to `HttpUtility.UrlEncode`, the result of which it `.ToUpper`s. – Mark Hurd Mar 11 '14 at 10:20
  • 1
    @TimSchmelter I don't have an example. That's my question, is `Value.Substring(Value.IndexOf(symbol), 1)` ever not equivalent to `symbol.ToString` when `symbol` is definitely present in `Value`? – Mark Hurd Mar 11 '14 at 10:23
  • It's somewhat hard to follow the description - it would be much simpler if you could just post the code you're looking at. – Jon Skeet Mar 11 '14 at 10:27
  • @JonSkeet I've added the code. Note I know the comments regarding "some symbols produce > 2 char values" is effectively crap. Remember this is third party code twice removed :-) – Mark Hurd Mar 11 '14 at 10:32
  • Again "symbols produce > 2 char values" produces the same Google hits... – Mark Hurd Mar 11 '14 at 10:35
  • 1
    This code just looks confused. Not only is the index search unnecessary, the resulting index might be different than the index of `symbol`. Anyway, the question is valid. – usr Mar 11 '14 at 10:47
  • 1
    One significant issue here is how surrogate pairs are *meant* to be handled. I've no idea what URL encoding is meant to do at that point. I'd be interested in what *needs* the URL encoding to be upper case, too... – Jon Skeet Mar 11 '14 at 11:04
  • @JonSkeet It looks like this issue has been discussed on [SO](http://stackoverflow.com/q/918019/256431 ".net UrlEncode - lowercase problem"). – Mark Hurd Mar 11 '14 at 11:11
  • FYI My [final version](http://stackoverflow.com/a/25642368/256431) of this code. – Mark Hurd Sep 03 '14 at 10:48

1 Answers1

0

You have 2 ?
Answering the question in the title.
Does String.IndexOf(Char) always point to the same character Char?
From the documentation yes.

String.IndexOf Method (Char)

Reports the zero-based index of the first occurrence of the specified Unicode character in this string.

paparazzo
  • 44,497
  • 23
  • 105
  • 176
  • Where in the question have you identified that IndexOf does not behave as documented? "first occurrence" So the char may occur more than once in the string. Have you found a single case where IndexOf does not return the first occurrence? – paparazzo Mar 11 '14 at 12:23
  • I don't expect `IndexOf` to not behave as documented; that's why I believe the expression in the OP can be changed to just `symbol.ToString`. – Mark Hurd Mar 11 '14 at 12:29
  • Read your title. Does String.IndexOf(Char) always point to the same Char? – paparazzo Mar 11 '14 at 13:03
  • Reports the zero-based index of the first occurrence of the specified Unicode character in this string. – paparazzo Mar 11 '14 at 13:21
  • Because, as implied by other comments in this question and the one I linked to in my comment, the code needed much refinement, and noone has suggested there's a reason why my optimisation would have been wrong in isolation, I have given you the green tick. – Mark Hurd Mar 20 '14 at 02:12