55

Using C# and .NET 3.5, what's the best way to handle this situation? I have hundreds of fields to compare from various sources (mostly strings). Sometimes the source returns a string field as null and sometimes as empty, and of course sometimes there is text in the field. My current comparison, strA != strB, isn't cutting it because, for example, strA is null and strB is "". I know I could use string.IsNullOrEmpty, but that results in a double comparison and some ugliness. Is there a better way to handle this? I thought of extension methods, but you can't extend operators.

I guess I'm looking for a sexy way to do this.

billb

8 Answers

92

Doesn't eliminate the extra underlying comparisons, but for the sexiness factor, you could use something like this:

(strA ?? "") == (strB ?? "")

or the slightly less sexy, but preferable form:

(strA ?? string.Empty) == (strB ?? string.Empty)
iammichael
  • Well, I was trying to get around redoing hundreds of comparisons with an extension method, but I suppose I'm going to have to bite the bullet. I may end up writing a method that performs this behind the scenes. Thanks. – billb Nov 25 '09 at 15:00
  • The `string.Empty` isn't _quite_ as sexy (less compact) which is why I included both, but yeah, using `string.Empty` is preferred. – iammichael Nov 25 '09 at 15:07
  • "string.Empty is preferred" – Kevin Whitefoot Dec 04 '14 at 11:49
  • 5
    Damn! Hit the enter key at the wrong moment. Meant to say: "string.Empty is preferred" Why? It is longer, uglier and makes no difference to the run time behaviour. See other SO entries about String.Empty and [link](http://blog.codinghorror.com/micro-optimization-and-meatballs/). As far as I am concerned "" is perfectly readable and preferable to String.Empty. – Kevin Whitefoot Dec 04 '14 at 12:00
  • Readable? Maybe. Preferable? Not always. string.Empty does not allocate a new immutable string instance used for interning (which might equal to one additional object reference per-assembly or per-AppDomain not sure if either of those are true). In terms of raw performance (at the time of that article and on machines/runtimes tested), there are alternatives that yield slightly better results. I prefer string.Empty > String.Empty > "" simply because I prefer "string" over "String" and an inline constant ("") would normally be placed into a const and my refactor senses are left tingling!! – Graeme Wicksted Mar 11 '15 at 16:42
70

Since you've got hundreds of comparisons to do, it sounds like you want a single function to call so that you can reduce the clutter and repetition in your code. I don't think there is a built-in function to do a null/empty string/comparison check all in one, but you could just make one yourself:

static class Comparison
{
    public static bool AreEqual(string a, string b)
    {
        if (string.IsNullOrEmpty(a))
        {
            return string.IsNullOrEmpty(b);
        }
        else
        {
            return string.Equals(a, b);
        }
    }
}

Then you could just use a single call to your function for each comparison:

if (Comparison.AreEqual(strA[0], strB[0])) { /* ... */ }
if (Comparison.AreEqual(strA[1], strB[1])) { /* ... */ }
if (Comparison.AreEqual(strA[2], strB[2])) { /* ... */ }
if (Comparison.AreEqual(strA[3], strB[3])) { /* ... */ }

This approach is also easier to expand if you later find that you need to worry about additional situations, such as ignoring whitespace at the beginning or ending of strings; you can then just add more logic to your function to do some trimming or whatever and you won't have to make any modifications to the hundreds of lines of code calling your function.
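
As a rough sketch of that kind of expansion, the helper could be written as an extension method that also trims whitespace. The names `StringComparisonExtensions` and `EqualsNullAsEmpty` are made up for illustration and are not part of the framework; note that trimming also makes a whitespace-only string equal to null, which may or may not be what you want.

```csharp
using System;

static class StringComparisonExtensions
{
    // Hypothetical helper: treats null and empty as equal and ignores
    // leading/trailing whitespace. Extension methods are plain static calls,
    // so invoking this on a null reference is safe.
    public static bool EqualsNullAsEmpty(this string a, string b)
    {
        // Normalize null to empty first so Trim() never sees a null.
        string left = (a ?? string.Empty).Trim();
        string right = (b ?? string.Empty).Trim();
        return string.Equals(left, right, StringComparison.Ordinal);
    }
}

static class Program
{
    static void Main()
    {
        string strA = null;
        string strB = "";

        Console.WriteLine(strA.EqualsNullAsEmpty(strB));        // True
        Console.WriteLine(" text ".EqualsNullAsEmpty("text"));  // True
        Console.WriteLine("text".EqualsNullAsEmpty(null));      // False
    }
}
```

Call sites then read `strA.EqualsNullAsEmpty(strB)`, or `!strA.EqualsNullAsEmpty(strB)` in place of `strA != strB`.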

Dr. Wily's Apprentice
  • 1
    String.Equals(a,b) handles null comparison as well, therefore can get rid of the string.IsNullOrEmpty() check https://msdn.microsoft.com/en-us/library/1hkt4325(v=vs.90).aspx – abhaybhatia Oct 06 '16 at 16:06
  • 4
    @abhaybhatia String.Equals(a,b) does not consider null to be equal to empty string, which is what was desired. Therefore, the string.IsNullOrEmpty check is necessary. – Dr. Wily's Apprentice Oct 15 '16 at 00:36
12

Not as sexy as ??, but you could avoid the double comparison part of the time if you short-circuit it:

string.IsNullOrEmpty( strA ) ? string.IsNullOrEmpty( strB ) : (strA == strB )
Eric
6

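What about

(strA ?? "") == (strB ?? "")

The parentheses are required: == binds more tightly than ??, so the unparenthesized form strA ?? "" == strB ?? "" does not compile.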
Robin Day
3

Addition: After a few years of writing equality comparers, my opinion has changed. I now think it is better for the equality comparer to expose a static member that holds a single created comparer, instead of every user creating a new instance.


(original answer, with the adjustment mentioned above)

The solutions others gave, including the one that proposes to define a Comparison class for the strings, forgot to write a new GetHashCode for your strings.

This means that your notion of string equality can't be used with classes that depend on GetHashCode, such as Dictionary<TKey, TValue> or HashSet<T>.

See Why is it important to override GetHashCode when Equals method is overridden?

Whenever you decide to change the concept of equality for any class, you should write an EqualityComparer for that class. This ensures that if two objects are considered equal according to your changed definition of equality, GetHashCode returns equal values for them.

public class NullStringComparer : EqualityComparer<string>
{
    public static IEqualityComparer<string> NullEqualsEmpty { get; } = new NullStringComparer();

    public override bool Equals(string x, string y)
    {
        // equal if string.Equals(x, y)
        // or both StringIsNullOrEmpty
        return String.Equals(x, y)
            || (String.IsNullOrEmpty(x) && String.IsNullOrEmpty(y));
    }

    public override int GetHashCode(string obj)
    {
        if (String.IsNullOrEmpty(obj))
            return 0;
        else
            return obj.GetHashCode();
    }
}

Usage:

public static void Main()
{
    string x = null;
    string y = String.Empty;

    Console.WriteLine("Standard string comparison: {0}", 
        StringComparer.Ordinal.Equals(x, y));

    Console.WriteLine("My string comparison: {0}",
        NullStringComparer.NullEqualsEmpty.Equals(x, y));

    // because according to the NullStringComparer x equals y
    // GetHashCode should return the same value
    int hashX = NullStringComparer.NullEqualsEmpty.GetHashCode(x);
    int hashY = NullStringComparer.NullEqualsEmpty.GetHashCode(y);
    Console.WriteLine($"hash X = {hashX}, hash Y = {hashY}");
} 
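
To illustrate why the matching GetHashCode matters, here is a minimal sketch that plugs such a comparer into a HashSet<string>. The comparer is repeated under the made-up name `NullAsEmptyComparer` only so the snippet compiles on its own; it is an illustration, not a replacement for the class above. Keep in mind that Dictionary<TKey, TValue> rejects null keys regardless of the comparer, so the sketch sticks to HashSet<T>.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-alone copy of the comparer idea: null and "" are equal
// and must therefore produce the same hash code.
sealed class NullAsEmptyComparer : EqualityComparer<string>
{
    public static readonly NullAsEmptyComparer Instance = new NullAsEmptyComparer();

    public override bool Equals(string x, string y)
    {
        return string.Equals(x ?? string.Empty, y ?? string.Empty);
    }

    public override int GetHashCode(string obj)
    {
        return string.IsNullOrEmpty(obj) ? 0 : obj.GetHashCode();
    }
}

static class Program
{
    static void Main()
    {
        var seen = new HashSet<string>(NullAsEmptyComparer.Instance);

        Console.WriteLine(seen.Add(null));     // True: first "blank" value
        Console.WriteLine(seen.Add(""));       // False: counts as a duplicate of null
        Console.WriteLine(seen.Add("text"));   // True
        Console.WriteLine(seen.Contains(""));  // True
        Console.WriteLine(seen.Count);         // 2
    }
}
```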
Harald Coppoolse
3

This also works, and ignores case:

(strA ?? "").Equals(strB ?? "", StringComparison.OrdinalIgnoreCase)
2

What's wrong with string.IsNullOrEmpty()? I'm sure that since it is part of the .NET framework it's optimized and probably far more efficient than something you or I could write. It may not be sexy but it works. Write code that is easily readable and let the compiler sort out the details.

TLiebe
  • Because there are two checks, and I have to now replace hundreds of comparisons with all of this noise. – billb Nov 25 '09 at 14:59
  • That may be, but couldn't a find and replace help you out with updating the code? – TLiebe Nov 25 '09 at 15:02
  • Sure, but aren't you going to be replacing hundreds of comparisons anyway? How do you plan to do this replacement of `!=` without a manual effort? – Chris Farmer Nov 25 '09 at 15:02
  • +1 for "Write code that is easily readable and let the compiler sort out the details". Question looks like a premature optimisation IMHO – Binary Worrier Nov 25 '09 at 15:15
  • Overloading != somehow would get around the code change. Yes, search and replace would also help, but I said "sexy" dammit. LOL! – billb Nov 25 '09 at 15:15
  • 1
    @billb: Sexy fast or sexy pretty? 10,000,000 iterations of if ((s1 ?? "") == (s2 ?? "")) take approx 155 milliseconds, where 10,000,000 iterations of if (string.IsNullOrEmpty(s1) && string.IsNullOrEmpty(s2)) takes 145 milliseconds (where one string is empty and the other is null). I agree the difference isn't worth worrying about, I'm just pointing out that "better" is always in the context of competing requirements. – Binary Worrier Nov 25 '09 at 15:31
  • Thanks for the comparison of the two methods Binary Worrier. – TLiebe Nov 25 '09 at 15:52
0

If your 2 sets of fields are in some sort of collection, you may be able to use LINQ to your advantage. If the collections let you access the fields by key and both have the same keys, you can use this (ready to be pasted into LINQPad):

Dictionary<string,string> fields1 = new Dictionary<string,string>();
Dictionary<string,string> fields2 = new Dictionary<string,string>();

fields1.Add("field1", "this");
fields2.Add("field1", "this");
fields1.Add("field2", "is");
fields2.Add("field2", "");
fields1.Add("field3", "a");
fields2.Add("field3", null);
fields1.Add("field4", "test");
fields2.Add("field4", "test");

var test =
    from f1 in fields1
    join f2 in fields2
        on f1.Key equals f2.Key
    select (f1.Value ?? "") == (f2.Value ?? "");

test.Dump();

If you have the sets of fields in 2 indexed collections in the same order, you could use something like this:

string[] strings1 = { "this", "is", "a", "test" };
string[] strings2 = { "this", "", null, "test" };

var test =
    from s1 in strings1.Select((value, index) => new { value, index })
    join s2 in strings2.Select((value, index) => new { value, index })
        on s1.index equals s2.index
    select (s1.value ?? "") == (s2.value ?? "");

test.Dump();
Chris Breish