2

Apologies if this has been answered before but I can't come up with a good name to search for what I'm looking for. I have the potential for between 1-5 string variables (we'll call them A,B,C,D,E) that can have one of two values represented by 'P' and 'S'. These are for pluralized and singular word forms

The data will always be in the same order, ABCDE, so that is not a concern but it may not contain all five (could be only A, AB, ABC or ABCD). I'm looking for an algorithm that will handle that possibility while generating all potential plural/singular combinations. So in the case of a 5 variable string the results would be: SSSSS, SPSSS, SPPSS, SPSPS, ... PPPPP

I have the logic to pluralize and to store the data it's just a question of what is the logic that will generate all those combinations. If it matters, I am working in C#. Any help would be greatly appreciated!

Zoidfarb
  • 61
  • 8
  • 2
    are results of `` (empty), `1` (1 digit) and `11` (etc.) possible? do you consider result of `11` to be equal to `00011` or different? how about `01010` and `10001` - different or not? – quetzalcoatl Feb 03 '16 at 16:45
  • You can enumerate all integers from 0 to 2^5-1 (i.e. from 0 to 31 ) and represent each integer as binary number – Igor Bendrup Feb 03 '16 at 16:48
  • I tried to keep it fairly straight forward but perhaps more details will help. What I'm trying to do is pluralize/singularize every word in a potential search string of up to 5 words. So I need to search every combination of plural and singular words (hence 0 and 1 substituting for plural or singular). So there will never be a case where A is null and B is not. If someone submits "red bag" i need to search "red bag" "reds bag" "red bags" "reds bags" if that makes sense – Zoidfarb Feb 03 '16 at 16:49
  • And the reason I need to hit potentially ALL the combinations is because this particular search is a very strict 1:1 exact match search. – Zoidfarb Feb 03 '16 at 16:50
  • In that case it's probably worth reformulating the question with your exact requirements. – phant0m Feb 03 '16 at 16:51
  • Sure thing, I just wanted to avoid writing a book :) – Zoidfarb Feb 03 '16 at 16:52
  • Therefore you always need N items. If someone enters 3 words, you need to randomize for 3, not 2, not 1. You should focus on that. Now, go with what IgorBendrup suggested, but keep in mind that it's not 2^5-1 but 2^N-1 where N is the number of words, and that you need to represent that looping number as the last N bits. So if you have 32bit int variable that holds `7`(binary 00000000000000000000000000000111) and you have 4 words, then you look at final 0111 only. – quetzalcoatl Feb 03 '16 at 16:53
  • If you don't like bits, and if you are sure that it's always 1,2,3,4 or 5 words, you can IF/SWITCH on that wordcount and in case 1 - do a loop over 0,1; in case 2 - do a loop over 0,1 with inner loop over 0,1; in case of 3 - do a loop over 0,1 with inner loop over 0,1 with inner loop over 0,1 (and so on for 4 and 5), and the looping variables will mark you 0/1 states for word 1st,2nd,3rd, and so on. Simple. "Ugly". No bits. – quetzalcoatl Feb 03 '16 at 16:55
  • Possible duplicate of [Algorithm to generate all possible arrays of ones and zeros of a given length](http://stackoverflow.com/questions/4633584/algorithm-to-generate-all-possible-arrays-of-ones-and-zeros-of-a-given-length) (although different language, very similar) – quetzalcoatl Feb 03 '16 at 16:57

4 Answers4

2

So there are only two possible values, 0 and 1. Wait a minute... Zeroes and ones... Why does that sound familiar...? Ah, binary to the rescue!

Let's count a little in binary, starting with 0.

  • 0000 = 0
  • 0001 = 1
  • 0010 = 2
  • 0011 = 3
  • 0100 = 4
  • 0101 = 5
  • 0110 = 6
  • 0111 = 7
  • 1000 = 8
  • ...etc

If you look at the rightmost bit of the first two rows, we have all the possible combinations for 1 bit, 0 and 1.

If you then look at the two rightmost bits of the first four rows, you get all 2 bit combinations: 00, 01, 10 and 11.

The first eight rows have all three bit combinations, etc.

If you want all possible combinations of x bits, count all numbers from 0 to (2^x)-1 and look at the last x bits of the numbers written in binary.

(Likewise, if you instead have three possible values (0, 1 and 2), you can count between 0 and (3^x)-1 and look at the last x digits when written in ternary, and so on for all possible amounts of values.)

BambooleanLogic
  • 7,530
  • 3
  • 30
  • 56
0

I would advise a recursive algorithm. For example an algorithm like this could be the answer to your problem (I dont really know what returnvalues you exactly want)

public void getAllWords(ref List<string> result, string Prefix, int wordLength)
{
  if(wordLength == 0)
    result.add(prefix);
  else
  {
    getAllWords(result, prefix+"0", wordLength-1);
    getAllWords(result, prefix+"1", wordLength-1);
  }
}

to be called with

List<string> result = new List<string>();
getAllWords(result, "", 5);

I hope this works, I'm on mobile at the moment.

You can change that as you want to account for m a different alphabet (for example values 0,1,2..) as you like.

timschoen
  • 171
  • 2
  • 9
0

You can enumerate all integers from 0 to 2^5-1 (i.e. from 0 to 31 ) and represent each integer as bool[]. May be this will be helpful:

static bool[][] GetCombinations(int wordCount) {
    int length = (int) Math.Pow(2, wordCount);
    bool[][] res = new bool[length][];
    for (int i = 0; i < length; i++)
    {
        res [i] = new bool[wordCount];
        for (int j = 0; j < wordCount; j++) {
            res [i] [j] = ((i & (int)Math.Pow (2, j)) != 0);
        }
    }
    return res;
}
Igor Bendrup
  • 2,637
  • 2
  • 16
  • 15
0

"Recursive permutations C#" will do the trick for a google search. But I thought I'd attempt a solution for you using simple counting and bit masking. Here is some code that will do "binary" counting and, using bitshifting, determine if the position in the words should be pluralized (you mention you have those details already):

string input = "red bag";
string[] tokens = input.Split(new char[] { ' ' }, StringSplitOptions.RemoveEmptyEntries);

string[] test = new string[tokens.Length];

int size = (int)Math.Pow(tokens.Length, 2);

for (int i = 0; i < size; i++)
{
    for (int j = 0; j < tokens.Length; j++)
    {
        int mask = (1 << j);
        if ((mask & i) != 0)
        {
            test[j] = Pluralize(tokens[j]);
        }
        else
        {
            test[j] = Singularize(tokens[j]);
        }
    }

    Console.WriteLine(string.Join(" ", test));
}

Output:

red bag
reds bag
red bags
reds bags
Balah
  • 2,530
  • 2
  • 16
  • 24