0

Suppose I've two bit strings: runs and toggler, where a run is a group of contiguous like bits. Both of these bit strings can have an arbitrary arrangement of 1s and 0s (on and off respectively). For the sake of the question, I will use the example values below:

runs   : 1111011010100011
toggler: 1010001010010110

Does there exist a way to mask the two bit strings, or otherwise utilise any features of c++ outside of iteration (though the more general / language-independent, the better), to produce a result bit string that contains every run of 1s in runs which possesses at least one bit who has a corresponding 1 in toggler? A worked example of this using the example values provided can be seen as follows:

runs   : 1111011010100011
toggler: 1010001010010110
result : 1111011010000011

Where the first, second, third, and fourth runs of 1s in runs all possess at least one 1 corresponding to their constituent bits in toggler.

So far, I have that it is apparent that the positions of some of result's 0s can be identified, being the bits corresponding to ~runs. It is also apparent that the positions of some of result's 1s can be identified as runs & toggler. Given this information, any remaining unknown bits (equivalent to the bits that satisfy the condition runs & ~toggler) can be determined to be 0 iff the bits at either end of that run of unknown bits are zero. This can once again be seen below in the bit string unknown:

runs   : 1111011010100011
toggler: 1010001010010110
unknown: 1_1_0_1010_0001_ // 1 = runs & toggler, 0 = ~runs, _(unknown) = runs & ~toggler
result : 1111011010000011
  • None that come to mind. Understand, all will be iterative at some level. E.g. even `std::transform`, or other utilities iterate over the contents of the container. There is no built in *Shift & Mask* similar to how you would handle bitwise operations on fundamental types. Is that what you are asking for? – David C. Rankin Mar 30 '20 at 22:07
  • @DavidC.Rankin Thank you for your quick response. An iterative approach was the first solution that I had in mind, but given the potential drawback of having large groups of unknown bits with few toggles corresponding to them, as well as an arbitrary number of bits in the string, I was looking for alternative (non-iterative) solutions that I may have not yet considered, or a known solution I may not yet have knowledge of. – newbRaymond Mar 30 '20 at 22:24
  • You can shorten things significantly by using the [std::basic_string](https://en.cppreference.com/w/cpp/string/basic_string) member functions like `substr`, and `find` or `find_first_of` to step down the string locating the substing of `1`s in one string and then making a quick check if a `1` appears in that range of the other. The logic shouldn't be too bad, but it will be a roll-your-own type deal. – David C. Rankin Mar 30 '20 at 22:28

1 Answers1

2

It seems possible, but packed with annoying edge cases and some operations that aren't exactly "nice" even though they technically avoid iteration.

First the nice part. Getting, for each "group", a bit indicating whether any toggle was turned on for that group. An approach could be: take the toggles, put in a "blocker" 1 in the bit just after a group, and subtract the starting point of every group. Then if no toggle was set in a group, the "blocker" is reset by the borrow. Otherwise, if there is a toggle set, that toggle bit "eats" the borrow and the blocker survives. In code:

runs_first = runs & ~(runs << 1);
runs_after = ~runs & (runs << 1);
toggles_blocked = toggles | runs_after;
selected_groups = runs_after & (toggles_blocked - runs_first);

Example with your numbers (with zero pre-pended, to avoid an unfortunate edge case):

runs           : 01111011010100011
toggles        : 01010001010010110
runs_first     : 00001001010100001
runs_after     : 10000100101000100
toggles_blocked: 11010101111010110
difference     : 11001100100110101
selected_groups: 10000100100000100

If the groups were fixed-length, it would now be easy to expand those single-bit-flags to whole-group-masks.. or if the bit was located at the start of the group, it would also be easy. Reversing the bits gives a solution, using a subtraction trick:

rev_selected = reverse(selected_groups >> 1);
rev_runs = reverse(runs);
rev_runs_after = ~rev_runs & (rev_runs << 1);
rev_groupmask = (rev_runs_after - rev_selected) & rev_runs;
groupmask = reverse(rev_groupmask)

But even an "efficient reverse" isn't that efficient, unless there is direct hardware support for it (eg rbit on ARM, grevi on RISC-V with extension B).

harold
  • 61,398
  • 6
  • 86
  • 164