1

I have a process that returns bit strings of given length N and given popcount k, i.e., each bitstring has exactly k ones and N - k zeros. I need to counts how often each bit string is returned. Currently, I use the associated base2-integer to increment a counter in a 2^N-dimensional array. However, this becomes impractical for large N, since the array is too large. To do better, I would like to exploit the fact that I know the popcount k beforehand, so I would only need an array of dimensions M = N choose k to store the counts. To do this I need a procedure to turn a given bitstring to a unique integer, e.g. for the example N=3, k=2, the following mapping would suffice:

011 → 1
110 → 2
101 → 3

Note that I don't require the reverse mapping, although it should exist. What could be such a mapping? Is this (related to) a known problem?

David Zwicker
  • 23,581
  • 6
  • 62
  • 77
  • Is there a reason you need to use an array and not a hash table for example? – SaiBot Feb 23 '18 at 11:33
  • No, I don't need a particular data structure, but I need to be able to iterate over the counts of all bit strings in the end. I wouldn't know how to use a hash table for this example. If you know how to do this, I would certainly appreciate an answer. – David Zwicker Feb 23 '18 at 11:35
  • You would use a structure that can store and access pairs in constant time. The key is the bitstring and the value the counter. In Java you could use a HashMap, in python a Dictionary. Whenever you receive a bitstring you check if this key is already in the hashtable. If not insert it with counter 1. Otherwise increase the counter by 1. This way you will only store the existing bitstring counters. – SaiBot Feb 23 '18 at 11:42
  • Thanks for the hint that this is related to the "ranking/unranking of combinations". I found the solution in https://stackoverflow.com/a/3143594/932593, where the function `unchoose` does exactly what I want. @ גלעד ברקן: If you post this as an answer, I'll accept it. – David Zwicker Feb 23 '18 at 11:49

1 Answers1

2

As SaiBot suggested, you could use a hash set to map a unique bitstring to a value. Otherwise, you could look up "ranking permutations" as a way to number all permutations of (n,k) zeros and ones.

גלעד ברקן
  • 23,602
  • 3
  • 25
  • 61