1

In my SQL Server database I have strings stored representing the correct solution to a question. In this string a certain format can be used to represent multiple correct solutions. The format:

possible-text [posibility-1/posibility-2] possible-text

This states either posibility-1 or posibility-2 is correct. There is no limit on how many possibilities there are (e.g. [ pos-1 / pos-2 / pos-3 / ... ] is possible). However, a possibility can be null, e.g.:

I am [un/]certain.

This means the answer could be "I am certain" or "I am uncertain". The format can also be nested in a sentence, e.g.:

I am [[un/]certain/[un/]sure].

The format can also occur multiple times in one sentence, e.g.:

[I am/I'm] [[un/]certain/[/un]sure].

What I want is to generate all the possible combinations. E.g. the above expression should return:

I am uncertain.
I am certain.
I am sure.
I am unsure.
I'm uncertain.
I'm certain.
I'm sure.
I'm unsure.

There is no limit on the nesting, nor the amount of possibilities. If there is only one possible solution then it will have not be in the above format. I'm not sure on how to do this.

I have to write this in C#. I think a possible solution could be to write a regex expression that can capture the [ / ] format and return me the possible solutions in a list (for every []-pair) and then generating the possible solutions by going over them in a stack-style way (some sort of recursion and backtracking style), but I'm not to a working solution yet.

I'm at a loss on to how exactly start on this. If somebody could give me some pointers on how to tackle this problem I'd appreciate it. When I find something I'll add it here.

Note: I noticed there are a lot of similar questions, however the solutions all seem to be specific to the particular problem and I think not applicable to my problem. If however I'm wrong, and you remember a previously answered question that can solve this, could you then tell me? Thanks in advance.

Update: Just to clarify if it was unclear. Every line in code is possible input. So this whole line is input:

[I am/I'm] [[un/]certain/[/un]sure].
tshepang
  • 12,111
  • 21
  • 91
  • 136
Floris Devriendt
  • 2,044
  • 4
  • 24
  • 34
  • 1
    **Step 1** : Match all outer `[]`-pairs, you can do this using regex. In C#, you need to use balancing groups, I used [this answer](http://stackoverflow.com/a/7899205) to your needs. So [here](http://pastebin.com/Dh6cd8FP)'s the code, you might test it on [regexhero.net](http://regexhero.net/tester/). Read more about [balancing groups](http://stackoverflow.com/a/17004406) **Step 2** : ??? **Step 3** : Profit !!! – HamZa Jan 28 '14 at 12:02
  • Are you saying that `[I am/I'm] [[un/]certain/[/un]sure].` is the input? – flindeberg Jan 28 '14 at 12:10
  • @HamZa Thanks, that's definitely helpful! – Floris Devriendt Jan 28 '14 at 16:58
  • @flindeberg yes, that is a possible input. Normally the inputs are simple, easy and straightforward. The one given here is a little more complex. – Floris Devriendt Jan 28 '14 at 16:58
  • @FlorisDevriendt I was just unsure of what your actual task was, it was not really clear what an example input could be. – flindeberg Jan 28 '14 at 17:17
  • @flindeberg, I updated the question to clarify it a bit. – Floris Devriendt Jan 28 '14 at 18:32

2 Answers2

2

This should work. I didn't bother optimizing it or doing error checking (in case the input string is malformed).

class Program
{
    static IEnumerable<string> Parts(string input, out int i)
    {
        var list = new List<string>();
        int level = 1, start = 1;
        i = 1;
        for (; i < input.Length && level > 0; i++)
        {
            if (input[i] == '[')
                level++;
            else if (input[i] == ']')
                level--;

            if (input[i] == '/' && level == 1 || input[i] == ']' && level == 0)
            {
                if (start == i)
                    list.Add(string.Empty);
                else
                    list.Add(input.Substring(start, i - start));
                start = i + 1;
            }
        }
        return list;
    }

    static IEnumerable<string> Combinations(string input, string current = "")
    {
        if (input == string.Empty)
        {
            if (current.Contains('['))
                return Combinations(current, string.Empty);
            return new List<string> { current };
        }
        else if (input[0] == '[')
        {
            int end;
            var parts = Parts(input, out end);
            return parts.SelectMany(x => Combinations(input.Substring(end, input.Length - end), current + x)).ToList();
        }
        else
            return Combinations(input.Substring(1, input.Length - 1), current + input[0]);
    }

    static void Main(string[] args)
    {
        string s = "[I am/I'm] [[un/]certain/[/un]sure].";
        var list = Combinations(s);
    }
}
svinja
  • 5,495
  • 5
  • 25
  • 43
  • That definitely works! Thank you! The only "problem" I had was when for input "[/something]" was given. The answers then were " " and "something" while it should be "" and "something". However I'll try adding that myself, gives me a way to learn your code as it uses some constructs I'm not used too (like the SelectMany linq expression etc). Anyway, thanks for the answer, I'll mark it as accepted as it does exactly what the question asked. :) – Floris Devriendt Jan 28 '14 at 17:02
1

You should create a parser that read character by character and builds up a logical tree of the sentence. When you have the tree it is easy to generate all possible combinations. There are several lexical parsers available that you could use, for example ANTLR: http://programming-pages.com/2012/06/28/antlr-with-c-a-simple-grammar/

Håkan Fahlstedt
  • 2,040
  • 13
  • 17
  • Even though I accepted the answer of svinja, I want to thank you for your answer as it is also helpful. I'll keep it in mind for future works! – Floris Devriendt Jan 28 '14 at 17:03