0

I have a bunch of sets say S1, S2, S3, ... Each set have different elements. Say S1 = { A, B, C}, S2 = { X, Y }, S3 = { P, Q, R, T }

There exists a combination of these sets K = { S1, S2, S3 }. For example, an instance of this combination would be { A, X, P }. There are obviously 3 x 2 x 4 = 24 combinations possible. What I need is the "rank" of a particular combination, calculated using simple ordered enumeration from left to right, and the vice versa.

Obviously, I can calculate this easily by simply enumerating all the combinations and comparing it to the requested combination while keeping a counter, but I need an efficient algorithm as my sets can contain up to 20000 elements each, and number of combined sets for some case are > 10.

BTW, I am aware of the thread Compute rank of a combination? here in stack overflow. But, unfortunately, it is not applicable here as my combinations are made out of different sized sets for different positions

I would appreciate an implementation in C# but other languages or pseudo-code would also be very helpful.

Any suggestions, please

kemal

UPDATE: @spinning_plane & @aasmund. Thank you for the answers. They both provide me with the same formula for calculating the rank.

But, I also need the other way around. i.e. to get the combination for a given rank (zero based). For example, given the rank 0, the result would be {A,X,P} ,for 3 {A, X, R }, etc. Anyone with an algorithm, please?

Community
  • 1
  • 1
Kemal Erdogan
  • 1,060
  • 1
  • 9
  • 19

2 Answers2

4

Think of your set as a number whose possible values for each digit is the size of the associated set. To see this, supposing the size of each set S1...S3 is the same, say 2 for simplicity. To calculate the rank of the set, you would simply interpret K as a binary number and translate it to its base-10 equivalent. rank(x) is simply the 0 based index of the element in the set.

rank(A)*2^0 + rank(X)*2^1 + rank(P)*2^2

Now to generalize this to the case when the sets can be different size, we can write out an expression for the calculation

rank(A) + rank(X)*len(S1) + rank(P)*len(S2)*len(S1) ... etc.

In psuedo-code

input = {'a','b','x'}
output = 0;
cumulative = 1;
for i in range(len(K)):
     output += cumulative*rank(input[i],K[i]) # this returns the index of input[i] in set K[i]
     cumulative*=len(K[i])
dfb
  • 13,133
  • 2
  • 31
  • 52
  • Thanks again for this valuable help. I also very much appreciate the answer from Aasmund. But, I accept yours as you also provided pseudo-code for it – Kemal Erdogan May 30 '11 at 09:37
3

Is this what the full "ranked sequence" look like?

0: {A, X, P}
1: {A, X, Q}
2: {A, X, R}
3: {A, X, S}
4: {A, Y, P}
5: {A, Y, Q}
...

If so, let the sets be numbered from right to left as S1, S2, ..., Sn, and let the ranks of the chosen elements within their own sets (e.g. A=0, B=1, C=2) be r1, r2, ..., rn. The formula should then be

rn * |S(n-1)| * ... * |S2| * |S1| + ... + r3 * |S2| * |S1| + r2 * |S1| + r1

Why? Let's say that we pick {C, Y, Q}. Their zero-based ranks within their respective sets are 2, 1, and 2, respectively. Because the leftmost rank is 2, it means that in order to get to that part of the ranking sequence, we need to let the rightmost and middle positions perform two full "rounds", for a total of (in this case) r2 * |S2| * |S1| = 2 * 2 * 4 = 16 rows. Then, we must skip past 1 round of the rightmost position in order to reach Y, and so on.

Edit: The formula can be simplified to

(((...) * |S3| + r3) * |S2| + r2) * |S1| + r1

(and it certainly ought to be computed in that way). Watch out for integer overflow, by the way...

Aasmund Eldhuset
  • 37,289
  • 4
  • 68
  • 81