1

I'm doing a tag-like string match function where function checks is string contains any of possible words while maintaining their order, at least per tag. I figured out it will be best to precreate list of possibilities and on check simply see if string contains each of required combinations

Maybe code will make it more clear.

List<List<string[]>> tags;

List<string[]> innerList;

List<List<string>> combinationsList;

public void Generate(string pattern)
{
    // i will add whitespace removal later so it can be ", " instead of only ","

    foreach (string tag in pattern.Split(','))
    {
        innerList = new List<string[]>();

        foreach (var varyword in tag.Split(' '))
        {
            innerList.Add(varyword.Split('|'));
        }
    }

    // atm i lack code to generate combinations in form of List<List<string>> 
    // and drop them into 'combinationsList'
}

// the check function will look something like isMatch = :
public bool IsMatch(string textToTest)
{
    return combinationsList.All(tc => tc.Any(c => textToTest.Contains(c)));
}

So for example pattern:

"old|young john|bob, have|posses dog|cat"

  • tags:
    • List_1:
      • {old, young}
      • {john, bob}
    • List_2
      • {have, posses}
      • {dog, cat}

So combinationsList will have:

  • combinationsList :
    • List_1
      • "old john"
      • "old bob"
      • "young john"
      • "young bob"
    • List_2
      • "have dog"
      • "have cat"
      • "posses dog"
      • "posses cat"

So results will be:

  • old bob have cat = true, contains List_1:"old bob" and List_2:"have cat"
  • young john have car = false, contains List_1:"young john" but doesn't contain any of List_2 combinations

I cannot figure out how to iterate collection to get those combinations and how to get the combination per iteration. Also i cannot mess up the order so old john will not be also generated as john old.

Please note that any of "variant words" in pattern may have more than 2 variants like for example "dog|cat|mouse"

Nerfpl
  • 167
  • 3
  • 8

2 Answers2

2

This code may help

string pattern = "old|young john|bob have|posses dog|cat";
var lists = pattern.Split(' ').Select(p => p.Split('|'));

foreach (var line in CartesianProduct(lists))
{
    Console.WriteLine(String.Join(" ",line));
}


//http://blogs.msdn.com/b/ericlippert/archive/2010/06/28/computing-a-cartesian-product-with-linq.aspx
static IEnumerable<IEnumerable<T>> CartesianProduct<T>(IEnumerable<IEnumerable<T>> sequences)
{
    // base case:
    IEnumerable<IEnumerable<T>> result = new[] { Enumerable.Empty<T>() };
    foreach (var sequence in sequences)
    {
        var s = sequence; // don't close over the loop variable
        // recursive case: use SelectMany to build the new product out of the old one
        result =
            from seq in result
            from item in s
            select seq.Concat(new[] { item });
    }
    return result;
}
I4V
  • 34,891
  • 6
  • 67
  • 79
  • To be honest, i would prefer simple foreach iterations if it's possible so i can understand clearly what is going on. also please note that combinations should not be in reverse order so A|B C|D will not yield C A or D B etc I know it sounds little ignorant but i prefer to learn step by step instead of simply copy paste :) – Nerfpl Apr 26 '13 at 15:57
  • @Nerfpl If you see the output, combinations are not reversed. they preserve the order. But I can't do anything for simple iterations(sorry, too lazy for this). Maybe someone would post that solution. – I4V Apr 26 '13 at 16:06
0

I found an answer in another thread.

https://stackoverflow.com/a/11110641/1156272

The code posted by Adam works flawlessly and does exactly what i needed

        foreach (var tag in pattern.Split(','))
        {
            string tg = tag;
            while (tg.StartsWith(" ")) tg = tg.Remove(0,1);
            innerList = new List<List<string>>();

            foreach (var varyword in tg.Split(' '))
            {
                innerList.Add(varyword.Split('|').ToList<string>());
            }

            //Adam's code

            List<String> combinations = new List<String>();
            int n = innerList.Count;
            int[] counter = new int[n];
            int[] max = new int[n];
            int combinationsCount = 1;
            for (int i = 0; i < n; i++)
            {
                max[i] = innerList[i].Count;
                combinationsCount *= max[i];
            }
            int nMinus1 = n - 1;
            for (int j = combinationsCount; j > 0; j--)
            {
                StringBuilder builder = new StringBuilder();
                for (int i = 0; i < n; i++)
                {
                    builder.Append(innerList[i][counter[i]]);
                    if (i < n - 1) builder.Append(" "); //my addition to insert whitespace between words
                }
                combinations.Add(builder.ToString());

                counter[nMinus1]++;
                for (int i = nMinus1; i >= 0; i--)
                {
                    // overflow check
                    if (counter[i] == max[i])
                    {
                        if (i > 0)
                        {
                            // carry to the left
                            counter[i] = 0;
                            counter[i - 1]++;
                        }
                    }
                }
            }

            //end

            if(combinations.Count > 0)
                combinationsList.Add(combinations);
        }
    }

    public bool IsMatch(string textToCheck)
    {
        if (combinationsList.Count == 0) return true;

        string t = _caseSensitive ? textToCheck : textToCheck.ToLower();

        return combinationsList.All(tg => tg.Any(c => t.Contains(c)));
    }

Looks like magic but it works. Thanks everyone

Community
  • 1
  • 1
Nerfpl
  • 167
  • 3
  • 8