15

I have a lot of if, else if statements and I know there has to be a better way to do this but even after searching stackoverflow I'm unsure of how to do so in my particular case.

I am parsing text files (bills) and assigning the name of the service provider to a variable (txtvar.Provider) based on if certain strings appear on the bill.

This is a small sample of what I'm doing (don't laugh, I know it's messy). All in all, There are approximately 300 if, else if's.

if (txtvar.BillText.IndexOf("SWGAS.COM") > -1)
{
    txtvar.Provider = "Southwest Gas";
}
else if (txtvar.BillText.IndexOf("georgiapower.com") > -1)
{
    txtvar.Provider = "Georgia Power";
}
else if (txtvar.BillText.IndexOf("City of Austin") > -1)
{
    txtvar.Provider = "City of Austin";
}
// And so forth for many different strings

I would like to use something like a switch statement to be more efficient and readable but I'm unsure of how I would compare the BillText. I'm looking for something like this but can't figure out how to make it work.

switch (txtvar.BillText)
{
    case txtvar.BillText.IndexOf("Southwest Gas") > -1:
        txtvar.Provider = "Southwest Gas";
        break;
    case txtvar.BillText.IndexOf("TexasGas.com") > -1:
        txtvar.Provider = "Texas Gas";
        break;
    case txtvar.BillText.IndexOf("Southern") > -1:
        txtvar.Provider = "Southern Power & Gas";
        break;
}

I'm definitely open to ideas.

I would need the ability to determine the order in which the values were evaluated. As you can imagine, when parsing for hundreds of slightly different layouts I occasionally run into the issue of not having a distinctly unique indicator as to what service provider the bill belongs to.

Milne
  • 850
  • 1
  • 11
  • 28

8 Answers8

23

Why not use everything C# has to offer? The following use of anonymous types, collection initializers, implicitly typed variables, and lambda-syntax LINQ is compact, intuitive, and maintains your modified requirement that patterns be evaluated in order:

var providerMap = new[] {
    new { Pattern = "SWGAS.COM"       , Name = "Southwest Gas" },
    new { Pattern = "georgiapower.com", Name = "Georgia Power" },
    // More specific first
    new { Pattern = "City of Austin"  , Name = "City of Austin" },   
    // Then more general
    new { Pattern = "Austin"          , Name = "Austin Electric Company" }   
    // And for everything else:
    new { Pattern = String.Empty      , Name = "Unknown" }
};

txtVar.Provider = providerMap.First(p => txtVar.BillText.IndexOf(p.Pattern) > -1).Name; 

More likely, the pairs of patterns would come from a configurable source, such as:

var providerMap =
    System.IO.File.ReadLines(@"C:\some\folder\providers.psv")
    .Select(line => line.Split('|'))
    .Select(parts => new { Pattern = parts[0], Name = parts[1] }).ToList();

Finally, as @millimoose points out, anonymous types are less useful when passed between methods. In that case we can define a trival Provider class and use object initializers for nearly identical syntax:

class Provider { 
    public string Pattern { get; set; } 
    public string Name { get; set; } 
}

var providerMap =
    System.IO.File.ReadLines(@"C:\some\folder\providers.psv")
    .Select(line => line.Split('|'))
    .Select(parts => new Provider() { Pattern = parts[0], Name = parts[1] }).ToList();
Mark Hurd
  • 10,665
  • 10
  • 68
  • 101
Joshua Honig
  • 12,925
  • 8
  • 53
  • 75
  • +1 - I like this, although it doesn't account for a no-match scenario (which you could easily do with `FirstOrDefault()`, and then a null check). – Tim M. Sep 11 '13 at 23:37
  • I don't believe the OP has yet confirmed (or denied) that the order in which patterns should be examined is important. – millimoose Sep 11 '13 at 23:39
  • +1: Another good approach, although initialization is more typing than `string[][2]`. However, I'll admit that it's a more general solution because it allows the key and value to be different types. – Jim Mischel Sep 11 '13 at 23:41
  • Aside: If I Were The OP, I'd be maintaining the mapping in a configuration file of some sort instead of hardcoding it, in which case the *anonymous* type wouldn't really be useful. – millimoose Sep 11 '13 at 23:44
  • @millimoose Anonymous types would still be useful; see the edit – Joshua Honig Sep 12 '13 at 00:07
  • 1
    @TimMedora If the `Pattern` of the last `Provider` is the empty string it will guarantee a catch-all match without having to worry about a `null` result (which would have no `Name` property) – Joshua Honig Sep 12 '13 at 00:08
  • @JoshuaHonig That's assuming you're reading the configuration file in the very method that uses the resulting dictionary. This means you'd be re-reading the configuration everytime it's used to fetch a provider. Surely you see the issue here - a configuration file is usually parsed once and then the in-memory representation is cached. – millimoose Sep 12 '13 at 00:08
  • 1
    @millimoose Aah..I see what you're saying (pointing out the difficulty of exposing anonymous types to other methods). In that case a trivial POCO class would suffice as well as the anonymous type used here. I just wanted to present a little "best-of-C#" showcase here for teaching purposes. – Joshua Honig Sep 12 '13 at 00:12
  • In the first solution, you don't need the .ToList(), it's already an array. – Niall Connaughton Sep 12 '13 at 00:51
  • Could this idea still work when the else if's may contain a range? Would be nice to move away from lots of if else statements e.g. if(totalWeight == 0) { shippingPrice = 0; } else if (totalWeight <= 2) { shippingPrice = 5.50m; } etc etc etc – Michael Harper Jul 11 '18 at 14:19
15

Since you seem to need to search for the key before returning the value a Dictionary is the right way to go, but you will need to loop over it.

// dictionary to hold mappings
Dictionary<string, string> mapping = new Dictionary<string, string>();
// add your mappings here
// loop over the keys
foreach (KeyValuePair<string, string> item in mapping)
{
    // return value if key found
    if(txtvar.BillText.IndexOf(item.Key) > -1) {
        return item.Value;
    }
}

EDIT: If you wish to have control over the order in which elemnts are evaluated, use an OrderedDictionary and add the elements in the order in which you want them evaluated.

Serdalis
  • 10,296
  • 2
  • 38
  • 58
  • 4
    The disadvantage of a dictionary is that you don't have any control over the order of the evaluation. You may have a case where "SCE" goes to "Southern California Edison" and "SCEC" goes to "South Carolina Electric Company". You want to look for SCEC before SCE. Better to use a `List` structure instead. You don't need fast lookup that the Dictionary hash provides. – Mark Lakata Sep 11 '13 at 23:14
  • 4
    @MarkLakata [OrderedDictionary](http://msdn.microsoft.com/en-us/library/system.collections.specialized.ordereddictionary.aspx) – Asad Saeeduddin Sep 11 '13 at 23:16
  • There was no indication of such a restriction. But OrderedDictionary came to mind too, it's just less standard and seemingly unnecessary in this context. But good point, I'll add an edit. – Serdalis Sep 11 '13 at 23:18
  • @MarkLakata I strongly strongly doubt the ordering matters in this case. – millimoose Sep 11 '13 at 23:19
  • @MarkLakata I'd also argue in this case it's better to ask the OP for clarification rather than read too much into their choice of control structure (and honestly, 400 lines of `else if`s is rarely a deliberate choice) and knock answers on an assumption. – millimoose Sep 11 '13 at 23:30
  • +1: A couple of other ways to do this would be a `string[][2]` or a `Tuple[]`, both of which would give you a defined order. Although I agree that order is unlikely to be relevant in the OP's particular case. – Jim Mischel Sep 11 '13 at 23:31
  • I'd use a SortedDictionary with a custom comparer which first compares by length(reverse) and then by contents.It's actually faster than normal string comparison. – Behrooz Sep 12 '13 at 00:43
  • Mark's point that you're not benefiting from the fast lookup of Dictionary is still true. This would have the same performance profile if you used a List structure as he suggested - because you're just foreaching through the set of pairs, not looking them up by key. – Niall Connaughton Sep 12 '13 at 00:48
  • @NiallConnaughton So you suggest OP to not use dictionary because he won't need one of the features it provides?you should really consider the cleaner code it results in. – Behrooz Sep 12 '13 at 00:52
  • @NiallConnaughton I shy away from collections containing pairs as much as I can. It just feels cleaner, Dictionary is also a more language agnostic way of doing this but his point is valid. – Serdalis Sep 12 '13 at 01:06
  • The code is going to end up looking the same - add to a dictionary, or add to a list, or create an array with a collection initialiser. See Joshua's answer for a nicely designed array of substitutions. It's misleading to the reader to use a dictionary when the only access to it is using it as a plain enumerable. – Niall Connaughton Sep 12 '13 at 06:49
10

One more using LINQ and Dictionary

var mapping = new Dictionary<string, string>()
                        {
                            { "SWGAS.COM", "Southwest Gas" },
                            { "georgiapower.com", "Georgia Power" }
                            .
                            .
                        };

return mapping.Where(pair => txtvar.BillText.IndexOf(pair.Key) > -1)
              .Select(pair => pair.Value)
              .FirstOrDefault();

If we prefer empty string instead of null when no key matches we can use the ?? operator:

return mapping.Where(pair => txtvar.BillText.IndexOf(pair.Key) > -1)
              .Select(pair => pair.Value)
              .FirstOrDefault() ?? "";

If we should consider the dictionary contains similar strings we add an order by, alphabetically, shortest key will be first, this will pick 'SCE' before 'SCEC'

return mapping.Where(pair => txtvar.BillText.IndexOf(pair.Key) > -1)
              .OrderBy(pair => pair.Key)
              .Select(pair => pair.Value)
              .FirstOrDefault() ?? "";
Tommy Grovnes
  • 4,126
  • 2
  • 25
  • 40
7

To avoid the blatant Schlemiel the Painter's approach that looping over all the keys would involve: let's use regular expressions!

// a dictionary that holds which bill text keyword maps to which provider
static Dictionary<string, string> BillTextToProvider = new Dictionary<string, string> {
    {"SWGAS.COM", "Southwest Gas"},
    {"georgiapower.com", "Georgia Power"}
    // ...
};

// a regex that will match any of the keys of this dictionary
// i.e. any of the bill text keywords
static Regex BillTextRegex = new Regex(
    string.Join("|", // to alternate between the keywords
                from key in BillTextToProvider.Keys // grab the keywords
                select Regex.Escape(key))); // escape any special characters in them

/// If any of the bill text keywords is found, return the corresponding provider.
/// Otherwise, return null.
string GetProvider(string billText) 
{
    var match = BillTextRegex.Match(billText);
    if (match.Success) 
        // the Value of the match will be the found substring
        return BillTextToProvider[match.Value];
    else return null;
}

// Your original code now reduces to:

var provider = GetProvider(txtvar.BillText);
// the if is be unnecessary if txtvar.Provider should be null in case it can't be 
// determined
if (provider != null) 
    txtvar.Provider = provider;

Making this case-insensitive is a trivial exercise for the reader.

All that said, this does not even pretend to impose an order on which keywords to look for first - it will find the match that's located earliest in the string. (And then the one that occurs first in the RE.) You do however mention that you're searching through largeish texts; if .NET's RE implementation is at all good this should perform considerably better than 200 naive string searches. (By only making one pass through the string, and maybe a little by merging common prefixes in the compiled RE.)

If ordering is important to you, you might want to consider looking for an implementation of a better string search algorithm than .NET uses. (Like a variant of Boyer-Moore.)

millimoose
  • 39,073
  • 9
  • 82
  • 134
4

What you want is a Dictionary:

Dictionary<string, string> mapping = new Dictionary<string, string>();
mapping["SWGAS.COM"] = "Southwest Gas";
mapping["foo"] = "bar";
... as many as you need, maybe read from a file ...

Then just:

return mapping[inputString];

Done.

i_am_jorf
  • 53,608
  • 15
  • 131
  • 222
  • 7
    Right track but not really correct. There is no equivalence between the intputstring and the key. Notice that the OP use IndexOf – Steve Sep 11 '13 at 23:06
  • 2
    You're not addressing how the OP could remove the if else structure from their code. All you've done is made the `txtvar.Provider = "CPS Energy";` a bit cleaner, i.e. `txtvar.Provider = mapping[searchString];` – Asad Saeeduddin Sep 11 '13 at 23:06
4

One way of doing it (other answers show very valid options):

void Main()
{
    string input = "georgiapower.com";
    string output = null;

    // an array of string arrays...an array of Tuples would also work, 
    // or a List<T> with any two-member type, etc.
    var search = new []{
        new []{ "SWGAS.COM", "Southwest Gas"},
        new []{ "georgiapower.com", "Georgia Power"},
        new []{ "City of Austin", "City of Austin"}
    };

    for( int i = 0; i < search.Length; i++ ){

        // more complex search logic could go here (e.g. a regex)
        if( input.IndexOf( search[i][0] ) > -1 ){
            output = search[i][1];
            break;
        }
    }

    // (optional) check that a valid result was found.
    if( output == null ){
        throw new InvalidOperationException( "A match was not found." );
    }

    // Assign the result, output it, etc.
    Console.WriteLine( output );
}

The main thing to take out of this exercise is that creating a giant switch or if/else structure is not the best way to do it.

Tim M.
  • 53,671
  • 14
  • 120
  • 163
  • Case insensitive matching can be accomplished with a dictionary. – Asad Saeeduddin Sep 11 '13 at 23:08
  • @Asad - thanks. Never knew that but you are right: http://stackoverflow.com/questions/6676245/c-sharp-dictionary-making-the-key-case-insensitive-through-declarations – Tim M. Sep 11 '13 at 23:10
  • 1
    You could even implement the fuzzy matching in an `IEqualityComparer` and plug that into the dictionary. – Asad Saeeduddin Sep 11 '13 at 23:11
  • 1
    +1 because this evaluates the same as the original if/else but is easier to read. The dictionary method doesn't do the same thign. – Mark Lakata Sep 11 '13 at 23:16
  • 3
    @MarkLakata Although I agree that the traditional use of a dictionary would be more restrictive than this method, the way I have used it in my answer is identical in function to how Tim has used an array. We are just storing the elements differently. Logically the answers are equivalent. – Serdalis Sep 11 '13 at 23:27
1

There are several approaches to do this, but for the reason of simplicity, conditional operator may be a choice:

Func<String, bool> contains=x => {
    return txtvar.BillText.IndexOf(x)>-1;
};

txtvar.Provider=
    contains("SWGAS.COM")?"Southwest Gas":
    contains("georgiapower.com")?"Georgia Power":
    contains("City of Austin")?"City of Austin":
    // more statements go here 
    // if none of these matched, txtvar.Provider is assigned to itself
    txtvar.Provider;

Note the result is according to the more preceded condition which is met, so if txtvar.BillText="City of Austin georgiapower.com"; then the result would be "Georgia Power".

Ken Kin
  • 4,503
  • 3
  • 38
  • 76
0

you can use dictionary.

Dictionary<string, string> textValue = new Dictionary<string, string>();
foreach (KeyValuePair<string, string> textKey in textValue)
{
  if(txtvar.BillText.IndexOf(textKey.Key) > -1) 
   return textKey.Value;

}
Suman Banerjee
  • 1,923
  • 4
  • 24
  • 40