I need to convert some string comparisons from VB to C#. The VB code uses the > and < operators, and I want to replace these with standard framework string comparison methods. But there's a behaviour I don't understand. To replicate it, I have this test:

[TestMethod]
public void TestMethod2()
{
    string originalCulture = CultureInfo.CurrentCulture.Name; // en-GB

    var a = "d".CompareTo("t");  // returns -1
    var b = "T".CompareTo("t");  // returns 1

    Assert.IsTrue(a < 0, "Case 1");
    Assert.IsTrue(b <= 0, "Case 2");
}

Could someone explain why b is returning 1? My current understanding is that if the comparison is case-sensitive then "T" should precede "t" in the sort order, i.e. -1. If it is case-insensitive they would be the same, i.e. 0.

(FYI .Net Framework 4.5.2)

Many thx

Simon Woods
  • Can you also show the VB code? – Tim Schmelter Jul 21 '17 at 10:19
  • Why do you specifically expect upper case to precede lower case? (It does if you use an *ordinal* comparison, admittedly.) – Jon Skeet Jul 21 '17 at 10:19
  • Does the VB code return something other than `1` for `"T".CompareTo("t")`? – Fabio Jul 21 '17 at 10:21
  • From https://msdn.microsoft.com/en-us/library/35f0x18w(v=vs.110).aspx "This method performs a word (case-sensitive and culture-sensitive) comparison using the current culture". It then recommends reading https://msdn.microsoft.com/en-us/library/84787k22(v=vs.110).aspx for more info, which states "The comparison uses the current culture to obtain culture-specific information such as casing rules". So it could depend on what your default culture is and what its rules are. You could override the defaults by using this method https://msdn.microsoft.com/en-us/library/e6883c06(v=vs.110).aspx – ADyson Jul 21 '17 at 10:24
  • Use `int a = string.Compare("d", "t", StringComparison.Ordinal)`, which returns -16, and `int b = string.Compare("T", "t", StringComparison.Ordinal);`, which returns -32. If you don't specify the `StringComparison` then `CurrentCulture` is used. – Tim Schmelter Jul 21 '17 at 10:28
  • I included current culture purely for info – Simon Woods Jul 21 '17 at 10:33
  • It looks like for English culture we go with lowercase first. Shrug. :) – Derek Jul 21 '17 at 10:34
  • @JonSkeet - I was thinking ASCII. I couldn't really see how T could ever follow t in the sort order on the basis of ASCII - irrespective of case – Simon Woods Jul 21 '17 at 10:34
  • Right, but the comparison you're performing is a cultural one. If you *ask* for an ordinal comparison, (e.g. `string.Compare("T", "t", StringComparison.Ordinal)`) then that will give a different result. – Jon Skeet Jul 21 '17 at 10:35
  • @TimSchmelter - thx. – Simon Woods Jul 21 '17 at 10:35
  • @JonSkeet - so Derek was right ... for we uk-ers, culturally "t" precedes "T"? – Simon Woods Jul 21 '17 at 10:40
  • @SimonWoods: That's certainly what .NET appears to believe, yes. (Same for the invariant culture.) If you want the details, CLDR probably has them... – Jon Skeet Jul 21 '17 at 10:45
  • Imperial code pages? ;-) thx everyone – Simon Woods Jul 21 '17 at 10:52
  • Hello @SimonWoods I have read on a Microsoft forum that you managed to create a usercontrol for WPF TextBox in WinForms to be used in VB6. I am looking for the same. Do you think you would perhaps share you control with me? This would help me really much, I think. Thank you very much for a reply! (Didn't know how to contact you otherwise than here :-)) – tmighty Feb 10 '22 at 11:15

3 Answers

In a culture-sensitive comparison, lower case comes before upper case. That's true both for en-GB and for InvariantCulture.

If you want ASCII-like behavior, you should request an ordinal comparison, for example by passing the additional CompareOptions.Ordinal parameter.

See the following:

Sample code on repl.it:

using System;
using System.Globalization;
using System.Collections.Generic;

class MainClass
{
    public static void Main(string[] args)
    {

        //All of the following case-sensitive, culture-sensitive comparisons put d before D
        Console.WriteLine("D".CompareTo("d"));
        Console.WriteLine(String.Compare("D", "d", false));
        Console.WriteLine(String.Compare("D", "d", false, CultureInfo.InvariantCulture));

        //The following case-sensitive ordinal comparison puts capital D before small letter d
        Console.WriteLine(String.Compare("D", "d", CultureInfo.InvariantCulture, CompareOptions.Ordinal));

        //The following is case insensitive
        Console.WriteLine(String.Compare("D", "d", true));

        //The default string ordering in my case is d before D
        var list = new List<string>(new[] { "D", "d" });
        list.Sort();
        foreach (var s in list)
        {
            Console.WriteLine(s);
        }
    }
}

//Results on repl.it
//Mono C# compiler version 4.0.4.0
//   
//1
//1
//1
//-32
//0
//d
//D

Good luck

Eyal

The sort order is based on either a binary comparison or a textual comparison depending on the setting of Option Compare. - https://learn.microsoft.com/en-us/dotnet/visual-basic/programming-guide/language-features/operators-and-expressions/comparison-operators

As can be seen there are two ways of comparing strings... So which is default?

The Option Compare statement specifies the string comparison method (Binary or Text). The default text comparison method is Binary. - https://learn.microsoft.com/en-us/dotnet/visual-basic/language-reference/statements/option-compare-statement

As can be seen, Binary is the default comparison. This means that any capital letter is seen as less than any lower case letter, i.e. "T"<"t" is true but, unintuitively, "Z"<"t" and "A"<"t" are also both true.

So it is likely that VB was not doing what you expected and was using a comparison equivalent to passing StringComparison.Ordinal to the String.Compare method. I say "not what you expected" because people comparing strings normally expect the "normal" textual comparison.
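As a rough illustration of the point above, the VB `<` and `>` operators under Option Compare Binary could be mirrored in C# with an ordinal comparison. This is only a sketch: the class and method names (`VbStringOperators`, `LessThan`, `GreaterThan`) are made up for this example, and it assumes the VB project really does use the default Option Compare Binary. The null handling reflects the assumption that VB string comparison treats Nothing like an empty string; drop it if that does not matter for your data.

```csharp
using System;

// Hypothetical helpers (names are illustrative, not from any framework API)
// that mimic VB's "<" and ">" on strings under Option Compare Binary.
static class VbStringOperators
{
    public static bool LessThan(string left, string right)
    {
        // Assumption: VB compares Nothing as if it were "", so normalise nulls first.
        return string.CompareOrdinal(left ?? string.Empty, right ?? string.Empty) < 0;
    }

    public static bool GreaterThan(string left, string right)
    {
        return string.CompareOrdinal(left ?? string.Empty, right ?? string.Empty) > 0;
    }
}

class Program
{
    static void Main()
    {
        Console.WriteLine(VbStringOperators.LessThan("d", "t"));    // True
        Console.WriteLine(VbStringOperators.LessThan("T", "t"));    // True: 'T' (84) < 't' (116)
        Console.WriteLine(VbStringOperators.LessThan("Z", "t"));    // True: every upper case letter sorts first
        Console.WriteLine(VbStringOperators.GreaterThan("t", "T")); // True
    }
}
```

Only the sign of the ordinal result is used here, because the magnitude (such as -32) is an implementation detail rather than something to rely on.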

Chris
  • "If you really want to replicate this behaviour then you can for a char by casting them to an int before comparison but for a longer string it would be much trickier to replicate." - no, to perform an ordinal comparison, you use `StringComparison.Ordinal`, assuming you're content with the "binary" representation being "UTF-16". – Jon Skeet Jul 21 '17 at 10:36
  • @JonSkeet: Yeah, I was just editing that. I had a moment. ;-) – Chris Jul 21 '17 at 10:36

If no culture or comparison type is specified, strA.CompareTo(strB) uses

CultureInfo.CurrentCulture.CompareInfo.Compare(strA, strB, CompareOptions.None);

For en-GB lowercase letters are smaller than uppercase ones.
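To make that equivalence concrete, here is a small sketch that spells out the implicit culture-sensitive call next to an explicit ordinal one. The helper names (`DefaultComparisonDemo`, `CompareWithCurrentCulture`, `OrdinalCompare`) are invented for this example. No output is claimed for the culture-sensitive result itself, since it depends on the current culture and the platform's collation data.

```csharp
using System;
using System.Globalization;

static class DefaultComparisonDemo
{
    // Hypothetical helper: the culture-sensitive comparison that
    // a.CompareTo(b) performs implicitly, written out in full.
    public static int CompareWithCurrentCulture(string a, string b)
    {
        return CultureInfo.CurrentCulture.CompareInfo.Compare(a, b, CompareOptions.None);
    }

    // Hypothetical helper: the ordinal (UTF-16 code unit) comparison.
    public static int OrdinalCompare(string a, string b)
    {
        return string.Compare(a, b, StringComparison.Ordinal);
    }

    static void Main()
    {
        // The implicit and explicit culture-sensitive calls agree in sign.
        int implicitResult = Math.Sign("T".CompareTo("t"));
        int explicitResult = Math.Sign(CompareWithCurrentCulture("T", "t"));
        Console.WriteLine(implicitResult == explicitResult); // True

        // The ordinal comparison always puts 'T' (84) before 't' (116).
        Console.WriteLine(Math.Sign(OrdinalCompare("T", "t"))); // -1
    }
}
```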

Janos