3

I have a 32bit random value (say 631).

0...0000001001110111

Each of these bits is a flag. I would like to return a random flag from these bits in, if at all possible, O(1) operation. How would I select the bit position 0, 1, 2, 4, 5, 6 or 9 (or it's corresponding value 1, 2, 4, 16, 32, 64, 512) from the given value 631? Preferably with as least possible bias towards some bits.

Things I came up with:

    • Shift value right random number of bits (max. 10 in this case)
    • See if LSB is set
      • If it is: got a bit position (last shifted number of bits); done
      • if not:
        • If resulting value == 0; start over
        • If resulting value != 0, go back to shifting random bits again

Above is not O(1) and possibly need multiple iterations if we happen to only 'hit' the 0 bits.

    • Mask (and) with random value
    • Repeat until power of 2 is left or start over when value is 0.

Again, above is not O(1) unfortunately.

I'm pretty sure this must be possible with some bithisfting/masking/magic somehow...


Edit:

As CodeCaster suggested; this will give me all set bit values:

int myvalue = 631;
var matchingmasks = Enumerable.Range(0, 32)
                              .Select(i => 1 << i)
                              .Where(i => (myvalue & i) == i)
                              .ToArray();

From the resulting array I can pick a random element and I'll have found my 'random' bit (flag) from the given value. However, this still requires a (hidden, because Linq) for-loop, 'brute-forcing' each possible bit, memory allocations for the resulting array etc.

Community
  • 1
  • 1
J. Doe
  • 31
  • 5
  • Any random generation algorithm (including the simplest one) will generate a random value in `O(1)` –  Feb 10 '16 at 13:27
  • @:deleted: I don't want the binary representation; please read the question. I want to pick a (one) bit from the set bits and either have it's position (e.g. bit 3) OR it's "value" (e.g. 4 for bit 3) @Kilanny: I don't want a random number; I want a random bit from a **given** number. – J. Doe Feb 10 '16 at 13:27
  • If you have the binary representation then you could manipulate that to get the bits, or the position of the bits. IF I understand correctly you have 631 want as the input and you want the output to be 1, 2, 4 , 16, 32 – Donald Jansen Feb 10 '16 at 13:29
  • @DonaldJansen: `you could manipulate that to get the bits`: yes; the question is **how**. (also: 64 and 512 are a possible output you left out for the given value of 631). – J. Doe Feb 10 '16 at 13:31
  • CodeCaster gave an answer which I had in mind of answering (not exact same code), I know I left those out ;) – Donald Jansen Feb 10 '16 at 13:32
  • Using Regex you can use `return from Match match in Regex.Matches(Convert.ToString(value, 2), "1") select match.Index;` inside a IEnumerable method then it will return all the positions – Donald Jansen Feb 10 '16 at 13:48
  • @DonaldJansen Why would I first want to convert to string, then use a regex? That's a horrible solution. Yes, it will probably work but the overhead involved is tremendous. [CodeCaster's solution](http://stackoverflow.com/a/35316576/5908487) is way more efficient although I'm not sure it's the fastest possible. – J. Doe Feb 10 '16 at 14:05
  • 1
    Also: avoid the temptation to say "that's not the fastest possible". The fastest possible is not your goal. You are probably unwilling to spend even a small amount like ten million dollars to develop custom hardware to solve this problem *as fast as possible*. Your goal is "fast enough given my budget and other constraints". Since we know neither your performance requirements nor your budget, we don't know what meets that goal. – Eric Lippert Feb 10 '16 at 15:02
  • @EricLippert: Both the algorithms I came up with (1 and 2 in the question) are (potentially) worse than O(1) since they may have to repeat once or even more than once (worst case infinity if you're in bad luck with the RNG) to get my desired output. I am/looking for better than that (which *should*, as I guessed, be O(1)). Hence the title/question. And with "as fast as possible" I mean "not converting to string and using a regex on it" and "preferrably no loops/ifs" and "if at all possible just some bitshifts/masks/magic". But that could've been clearer, agreed. – J. Doe Feb 10 '16 at 15:02
  • Ah, I misunderstood. Yes, you should definitely avoid any solution that is a "guess and test" that can execute for a long time given a run of bad luck. This is not so bad when there are only 32 choices, but you sometimes see questions where the code runs until one chance in a million is hit, and that is not fast. – Eric Lippert Feb 10 '16 at 15:04
  • Does your case involve random integers? If yes, then you may consider `Math.Log`. I'm not sure how that would affect performance. – Avenicci Feb 10 '16 at 18:59
  • The big O measures how the algorithm will scale when the number of bits will be increased. As I understand it, the number of bits is not going to increase, so the requirements can be either as fast as possible, or constant time, or both, but maybe not O(1). Eric Lippert proposed a solution in O(log Nbits), constant time, and fast (the multiplicative constant in front of big O is small - which is what really count when Nbits is small). – aka.nice Feb 11 '16 at 13:47

6 Answers6

6

First off, I would suggest doing this the easy, straightforward, obvious way that you suggest in the question: make an array of values, choose an element at random. Yes this allocates memory and whatnot. Optimize for the code being readable and correct first; only when you have a demonstrated performance problem should you optimize it.

If you do want to optimize it down to bit twiddling, this page is my go-to resource: http://graphics.stanford.edu/~seander/bithacks.html

The algorithms you'll want here are:

  • first, pick your favourite algorithm for determining the Hamming weight -- that is, "how many bits are on?" Call that number n.
  • Now pick a random number r from 1 to n
  • Now read the algorithm called "select the bit position with the given count". This takes your number r and gives you the bit position of the rth true bit starting from the high end. The code given on the page is for longs; it should be straightforward to modify it for ints.

I note that a key feature of many of these algorithms is that they are branchless. When you are trying to wring the last ounce of performance out of an algorithm, remember that every "if" kills performance. An "if" means that there is code in the cache that is NOT running, because you branched away from it, and therefore you are making a cache miss more likely. An "if" means there is an opportunity for the branch predictor to make the wrong choice. At the CLR level, every "if" means more basic blocks, which means more work for the jitter to do its flow analysis. And so on.

Eric Lippert
  • 647,829
  • 179
  • 1,238
  • 2,067
  • `Optimize for the code being readable and correct first; only when you have a demonstrated performance problem should you optimize it.` I agree wholeheartedly and I live by this motto. However, I am *at* the step where I have demonstrated that the current code (which is currently a "brute force" implementation) needs some more work and I am trying to optimize it with bit twiddling because of the exact reasons you mentioned in your last paragraph/sentences. This code is (going to be) called many (tens of) thousands of times p/sec. so I am prepared to trade-off some readability. – J. Doe Feb 10 '16 at 15:21
  • I'll go visit that page and write an implementation and post it in my question when done for (possible) feedback. – J. Doe Feb 10 '16 at 15:22
  • Hmmm might be a while before I understand [Select the bit position (from the most-significant bit) with the given count (rank)](https://graphics.stanford.edu/~seander/bithacks.html#SelectPosFromMSBRank) and have it rewritten for 32 bit (currently can't even get it to work correctly in 64bit as-is...) – J. Doe Feb 10 '16 at 16:08
3

You can simply create the masks on beforehand, and select the masks that match the source value:

uint source = 631;
uint[] masks = Enumerable.Range(0, 32).Select(i => (uint)1 << i).ToArray();
uint[] matchingMask = masks.Where(m => (m & source) == m).ToArray();

Now matchingMask contains the values that make up your source value, in this case: 1, 2, 4, 16, 32, 64, 512.

Then from matchingMask you can select a random element.

If you want the bit position instead, you could use the indexing Select() overload like this:

var matchingMask = masks.Select((m, i) => new { Index = i, Mask = m}) 
                        .Where(m => (m.Mask & source) == m.Mask)
                        .ToArray();
Community
  • 1
  • 1
CodeCaster
  • 147,647
  • 23
  • 218
  • 272
  • Is this `O(1)`, as question title indicates? –  Feb 10 '16 at 13:28
  • 4
    @Kilanny you have no collection contains `n` elements then it's always `O(1)`. Or in other way it is not correct question. – Vadim Martynov Feb 10 '16 at 13:30
  • @Kilanny I don't really think this can be expressed in big O notation, but feel free to enlighten me. – CodeCaster Feb 10 '16 at 13:31
  • That is certainly a possibility; however performance wise this won't perform very well if I have a crapton of values and I want to select a random flag from each of these values. We can debate if this is O(1) (technically it probably is) but it is a lot of 'work' to get a random bit from the given value. I would prefer some bitshift/mask/magic instead of your (granted, correct from what I can see) solution. – J. Doe Feb 10 '16 at 13:34
  • Why `Math.Pow(2, i)` and not `1 << i`? – Matthew Watson Feb 10 '16 at 13:37
1

This is actually sort-of possible. There's a 64-bit solution here.

I've converted it to the following C# code. It's O(1) because the number of operations is not dependent on the number of bits set:

public static uint SelectRandomSetBit(ulong v, Random rng)
{
    ulong a = v - ((v >> 1) & ~0UL / 3);
    ulong b = (a & ~0UL / 5) + ((a >> 2) & ~0UL / 5);
    ulong c = (b + (b >> 4)) & ~0UL / 0x11;
    ulong d = (c + (c >> 8)) & ~0UL / 0x101;
    ulong t = ((d >> 32) + (d >> 48));
    int   n = (int)((d * (~(ulong)0 / 255)) >> (64 - 1) * 8);
    ulong r = (uint) rng.Next(1, n+1);
    ulong s = 64;

    s -= ((t - r) & 256) >> 3;
    r -= (t & ((t - r) >> 8));
    t = (d >> (int)(s - 16)) & 0xff;
    s -= ((t - r) & 256) >> 4;
    r -= (t & ((t - r) >> 8));
    t = (c >> (int)(s - 8)) & 0xf;
    s -= ((t - r) & 256) >> 5;
    r -= (t & ((t - r) >> 8));
    t = (b >> (int)(s - 4)) & 0x7;
    s -= ((t - r) & 256) >> 6;
    r -= (t & ((t - r) >> 8));
    t = (a >> (int)(s - 2)) & 0x3;
    s -= ((t - r) & 256) >> 7;
    r -= (t & ((t - r) >> 8));
    t = (v >> (int)(s - 1)) & 0x1;
    s -= ((t - r) & 256) >> 8;

    return (uint)(s-1);
}

Here's how I tested it:

Random rng = new Random();
ulong number = 0x0101010101010101;
int[] bits = new int[64];

for (int i = 0; i < 1000000; ++i)
    ++bits[SelectRandomSetBit(number, rng)];

for (int i = 0; i < 64; ++i)
    Console.WriteLine($"bit {i} was returned {bits[i]} times.");

You would expect to see every 8th bit returned approximately the same number of times, and none of the other bits returned. This is indeed what happens.

I leave converting this to 32 bit as an interesting exercise. ;)

(This is probably an unnecessary optimisation in any case: A simple loop to count the bits and then randomly choose one would probably be fast enough...)

Matthew Watson
  • 104,400
  • 10
  • 158
  • 276
  • "`ulong number = 0x0101010101010101;`" [...] "`You would expect to see all the even-numbered bits returned approximately the same number of times`"; that's weird 'cause `0x0101...101`... is a **hexadecimal** number (not binary) and I'd expect every 8th bit set. Try `ulong number = Convert.ToUInt64("0101010101010101", 2);`. Other than that it seems to work. Will try to get some (benchmark) numbers soon. Am I correct in thinking that `n` is the "Hamming Weight" of `v`? My second argument to the function should be `myvalue` (631), not an rng. So I need to pull out `n` too... – J. Doe Feb 11 '16 at 00:03
  • @J.Doe Sorry, I shall correct that! (I'd been running several tests, including one with the even bits set, but I posted the wrong sample.) Incidentally, the first argument is the number containing the bits (in your case 631), so I'm not sure why you say it should be the second argument. – Matthew Watson Feb 11 '16 at 08:34
  • "*Incidentally, the first argument is the number containing the bits (in your case 631), so I'm not sure why you say it should be the second argument*" My bad; I'm afraid I've been staring at code too long for today. You are absolutely correct and I have no idea why I came up with that. – J. Doe Feb 11 '16 at 09:09
0

This should be really & simply O(1):

byte b = 123;
Random r = new Random();
int bitNumber = r.Next(32);
var bit = (b & (1 << bitNumber-1)) != 0;

Check this

Community
  • 1
  • 1
  • OP wants to get the bit positions (or values) where the bit is set in the source value. Or: can you explain how this code does that? – CodeCaster Feb 10 '16 at 13:36
0

Seems like a homework problem... But, the solution is possible because you have only 32 bits to look for and 32 known-ahead of time values for each position. If I'm not mistaken, these are actually the same (a mask with the second bit set has the value "2" when interpreted as an integer).

What you do is build your 32 entry array with the prepared bit mask that will return only that bit.

The array lookup is O(1) since the speed is constant regardless of which bit you are retrieving. At this point you & and compare with the original mask as you would have done when using bit shifting and the final result will be O(1) still.

Note though that while this is O(1) it may not be faster than bit shifting. The arrays are 32*4 bytes of memory, so 128 bytes. That's not huge but it's not tiny either. You would need to run a simple test to confirm that executing up to 32 bit shift instructions takes more time than retrieving an item from the array (my guess is that the array is faster, but I could be wrong).

huntharo
  • 2,616
  • 21
  • 23
  • 1
    It's not homework. The *actual* problem is that I have an `enum` with a `Flags` attribute; I want to find a random flag from the set flags in a given value. Your solution works, technically, but unless I pick random elements from the pre-prepared array it will have a bias to either LSB or MSB or whatever order the array is in. And if I pick random elements from the array I would need to invoke a `Rng.Next()` every iteration, keep track of which elements have already been 'used' and which not etc. etc. and then I start wondering if some 'bittwiddling' could be faster. – J. Doe Feb 10 '16 at 13:42
0

What about a lookup table?

public static class RandomExtensions
{
    public static uint GetRandomBitOf( this Random rand, uint mask )
    {
        if( mask == 0 ) return 0;
        var lo = smLookup[mask & 0xFFFF];
        var hi = smLookup[mask >> 16];
        int i = rand.Next( lo.Length + hi.Length );
        return i < lo.Length ? (uint) lo[i] : (uint) hi[i - lo.Length] << 16;
    }

    static RandomExtensions()
    {
        smLookup = new ushort[65536][];

        for( int i = 0; i < smLookup.Length; ++i )
        {
            ushort j = (ushort) i;
            smLookup[i] = Enumerable
                .Range( 0, 16 )
                .Select( b => (ushort) ( 1 << b ) )
                .Where( b => ( j & b ) != 0 )
                .ToArray();
        }
    }

    private static ushort[][] smLookup;
}

I'm not sure where this ranks with regards to performance amongst the other answers. I'm just adding this answer mostly for completeness in terms of possible implementations.

Kyle
  • 6,500
  • 2
  • 31
  • 41