2

Is it possible in C# to turn a bool into a byte or int (or any integral type, really) without branching?

In other words, this is not good enough:

var myInt = myBool ? 1 : 0;

We might say we want to reinterpret a bool as the underlying byte, preferably in as few instructions as possible. The purpose is to avoid branch prediction fails as seen here.

Timo
  • 7,992
  • 4
  • 49
  • 67
  • I will add that a 2-entry `Dictionary` feels like overkill. Hash-based lookup seems on the heavy side for what we are trying to do. – Timo Nov 19 '19 at 13:18
  • _"The purpose is to avoid branch prediction fails as seen here"_ -- do you have any evidence whatsoever that you actually _need_ to do that? I.e. you have a _measured_ performance problem, which you have specifically confirmed is 100% caused by the branch-prediction issue described? Bottom line: you can cast using unsafe code, but you'd have to do it inline, otherwise the method call may cause as much overhead than a missed branch, and even then you still have to deal with the problem that there's no guarantee about what value is stored in a `bool`. You can't rely on `1` for a `true` value. – Peter Duniho Nov 19 '19 at 19:49
  • @PeterDuniho Indeed, I'm making no assumptions about `true` values other than that they are non-zero. As for measurements: The linked question and its accepted answer cover that well, in my opinion. Otherwise, you may consider this a theoretical exercise, to be measured whenever it is applied in a particular scenario. Let's say that I want to have a solution available when the branch prediction failure is determined to be an issue. And yes, that requires something inline that is very efficient! – Timo Nov 20 '19 at 12:16
  • _"that requires something inline that is very efficient"_ -- not really. Most branch-prediction issues can be addressed by restructuring the data. It seems to me that you're at least two steps ahead of the horse with your cart (there's no actual problem, and you've already decided what solution you think is required, even though you don't have a problem to measure). :) That said, I've addressed what is the simplest efficient way to avoid the `bool` test, in my answer below. – Peter Duniho Nov 20 '19 at 16:46
  • @PeterDuniho Although the suggestion to take a different route altogether is good and welcome, I think it's nice to have room to solve interesting problems, even if they can be circumvented in some or all circumstances. – Timo Nov 21 '19 at 16:03

4 Answers4

5
unsafe
{
     byte myByte = *(byte*)&myBool;   
}

Another option is System.Runtime.CompilerServices.Unsafe, which requires a NuGet package on non-Core platforms:

byte myByte = Unsafe.As<bool, byte>(ref myBool);

The CLI specification only defines false as 0 and true as anything except 0 , so technically speaking this might not work as expected on all platforms. However, as far as I know the C# compiler also makes the assumption that there are only two values for bool, so in practice I would expect it to work outside of mostly academic cases.

kalimag
  • 1,139
  • 2
  • 6
  • 11
  • 1
    Out of curiosity, can `Unsafe.As` be used without declaring our method and project as unsafe? – Timo Nov 20 '19 at 12:49
  • 1
    @Timo Yes, since it doesn't involve any pointer types (some other methods on `S.R.CS.Unsafe` do). However, just because it doesn't require the "unsafe" compiler flag doesn't make it *safe*, it's still unverifiable code that could be misused to violate memory safety. The same goes for all the other answers that have been posted to this question so far. – kalimag Nov 20 '19 at 13:11
  • Perfect. That's good to know. Actually, I believe using `MemoryMarshal.AsBytes` on a `ReadOnlySpan` cannot be misused to violate memory safety, as the elements are of the exact same size and cannot be mutated! – Timo Nov 20 '19 at 13:33
  • @Timo `MemoryMarshal.CreateReadOnlySpan` doesn't verify the `length` parameter and could create good old buffer overflows, however. – kalimag Nov 20 '19 at 13:39
  • Ok, ok, if one is foolish enough to pass it >1 length, you're right. :p – Timo Nov 20 '19 at 13:44
  • @Timo I didn't mean to sound pedantic, it's just that I think one should carefully consider the motivations for avoiding the `unsafe` keyword. The issue isn't having to check the checkbox, the issue is gaining the ability to screw things up without the language and runtime catching it. – kalimag Nov 20 '19 at 14:00
  • Understood. I think enabling the unsafe checkbox potentially opens up a whole range of dangers, so it's pleasant to avoid it and at least minimize the risk of violating memory safety. – Timo Nov 20 '19 at 14:07
  • 1
    Judging from the generated assembly, the solutions in this answer are the fastest. Next comes [`MemoryMarshal`](https://stackoverflow.com/a/58934997/543814), then [`StructLayout`](https://stackoverflow.com/a/58942062/543814), and finally the naive [branch](https://stackoverflow.com/q/58934996/543814). – Timo Jan 31 '22 at 16:20
2

The usual C# equivalent to "reinterpret cast" is to define a struct with fields of the types you want to reinterpret. That approach works fine in most cases. In your case, that would look like this:

[StructLayout(LayoutKind.Explicit)]
struct BoolByte
{
    [FieldOffset(0)]
    public bool Bool;
    [FieldOffset(0)]
    public byte Byte;
}

Then you can do something like this:

BoolByte bb = new BoolByte();
bb.Bool = true;
int myInt = bb.Byte;

Note that you only have to initialize the variable once, then you can set Bool and retrieve Byte as often as you like. This should perform as well or better than any approach involving unsafe code, calling methods, etc., especially with respect to addressing any branch-prediction issues.

It's important to point out that if you can read a bool as a byte, then of course anyone can write a bool as a byte, and the actual int value of the bool when it's true may or may not be 1. It technically could be any non-zero value.

All that said, this will make the code a lot harder to maintain. Both because of the lack of guarantees of what a true value actually looks like, and just because of the added complexity. It would be extremely rare to run into a real-world scenario that suffers from the missed branch-prediction issue you're asking about. Even if you had a legitimate real-world example, it's arguable that it would be better solved some other way. The exact alternative would depend on the specific real-world example, but one example might be to keep the data organized in a way that allows for batch processing on a given condition instead of testing for each element.

I strongly advise against doing something like this, until you have a demonstrated, reproducible real-world problem, and have exhausted other more idiomatic and maintainable options.

Peter Duniho
  • 68,759
  • 7
  • 102
  • 136
  • Agreed on the caveat, but a very interesting solution nonetheless. I'm familiar with `StructLayout(LayoutKind.Explicit)`, but I hadn't thought of it as a solution to this particular problem. :) It's a nice one if we have control of the initial booleans. If not, then we still have to copy them to this struct, which is a pity, but it does eliminate branching! – Timo Nov 20 '19 at 12:08
  • Agreed (and aware) that false is 0 and true is non-zero (not 1 per se). – Timo Nov 20 '19 at 12:08
  • 2
    Maybe this is more usable with `Unsafe.As(ref myBool);`? – Bruno Zell Dec 28 '19 at 00:25
  • @BrunoZell I really like that! That suddenly makes this an actual contender against my own solution. Particularly not having to reinterpret _twice_ or performing an indexing operation _feels_ better... although it might just be all the same, performance-wise. Would be interesting to benchmark. As a downside, we need to have the struct definition somewhere, rather than relying purely on System types. – Timo Mar 05 '20 at 11:44
1

Here is a solution that takes more lines (and presumably more instructions) than I would like, but that actually solves the problem directly, i.e. by reinterpreting.

Since .NET Core 2.1, we have some reinterpret methods available in MemoryMarshal. We can treat our bool as a ReadOnlySpan<bool>, which in turn we can treat as a ReadOnlySpan<byte>. From there, it is trivial to read the single byte value.

var myBool = true;
var myBoolSpan = MemoryMarshal.CreateReadOnlySpan(ref myBool, length: 1);
var myByteSpan = MemoryMarshal.AsBytes(myBoolSpan);
var myByte = myByteSpan[0]; // =1
Timo
  • 7,992
  • 4
  • 49
  • 67
0

maybe this would work? (source of the idea)

using System;
using System.Reflection.Emit;

namespace ConsoleApp10
{
    class Program
    {
        static Func<bool, int> BoolToInt;
        static Func<bool, byte> BoolToByte;

        static void Main(string[] args)
        {
            InitIL();

            Console.WriteLine(BoolToInt(true));
            Console.WriteLine(BoolToInt(false));
            Console.WriteLine(BoolToByte(true));
            Console.WriteLine(BoolToByte(false));

            Console.ReadLine();
        }

        static void InitIL()
        {
            var methodBoolToInt = new DynamicMethod("BoolToInt", typeof(int), new Type[] { typeof(bool) });
            var ilBoolToInt = methodBoolToInt.GetILGenerator();
            ilBoolToInt.Emit(OpCodes.Ldarg_0);
            ilBoolToInt.Emit(OpCodes.Ldc_I4_0); //these 2 lines
            ilBoolToInt.Emit(OpCodes.Cgt_Un); //might not be needed
            ilBoolToInt.Emit(OpCodes.Ret);

            BoolToInt = (Func<bool, int>)methodBoolToInt.CreateDelegate(typeof(Func<bool, int>));

            var methodBoolToByte = new DynamicMethod("BoolToByte", typeof(byte), new Type[] { typeof(bool) });
            var ilBoolToByte = methodBoolToByte.GetILGenerator();
            ilBoolToByte.Emit(OpCodes.Ldarg_0);
            ilBoolToByte.Emit(OpCodes.Ldc_I4_0); //these 2 lines
            ilBoolToByte.Emit(OpCodes.Cgt_Un);  //might not be needed
            ilBoolToByte.Emit(OpCodes.Ret);

            BoolToByte = (Func<bool, byte>)methodBoolToByte.CreateDelegate(typeof(Func<bool, byte>));

        }
    }
}

based on microsoft documentation of each emit.

  1. load the parameter in memory (the boolean)
  2. load in memory a value of int = 0
  3. compare if any the parameter is greater than the value (branching here maybe?)
  4. return 1 if true else 0

line 2 and 3 can be removed but the return value could be something else than 0 / 1

like i said in the beginning this code is taken from another response, this seem to be working yes but it seem slow while being benchmarking, lookup .net DynamicMethod slow to find way to make it "faster"

you could maybe use the .GetHashCode of the boolean?

true will return int of 1 and false 0

you can then var myByte = (byte)bool.GetHashCode();

Fredou
  • 19,848
  • 10
  • 58
  • 113
  • I like this idea! But alas, I've check the reference source: the implementation is `return (m_value)?True:False;`. – Timo Nov 19 '19 at 15:14
  • @Timo, i have updated my answer with an alternative solution :-) – Fredou Nov 19 '19 at 18:00
  • Your new solution made me chuckle. That is extravagant! And indeed it solves the problem! Could you add a little more explanation on what the opcodes in question do and why you are using these ones? – Timo Nov 20 '19 at 12:11
  • I can't seem to find a definitive answer, but it seems likely that `OpCodes.Cgt_Un` branches. :( – Timo Nov 21 '19 at 16:08