Custom 4 bit data type in C#

Question

I want to create a custom data type which is 4 bits (nibble).

One option is this -

byte source = 0xAD;
var hiNybble = (source & 0xF0) >> 4; //Left hand nybble = A
var loNybble = (source & 0x0F);      //Right hand nybble = D

However, I want to store each 4 bits in an array.

So for example, a 2 byte array,

00001111 01010000

would be stored in the custom data type array as 4 nibbles -

0000 1111 0101 0000

Essentially I want to operate on 4 bit types.

Is there any way I can convert the array of bytes into array of nibbles?

Appreciate an example.

Thanks.

The answer is in your question. There is no intrinsic type for a nibble. Use masks and shifts to go from using all 8 bits to 4-bit bytes. I would ask, though, why are you trying to do this? — Mitch, Jan 17 '21 at 03:53
Keep in mind, you can manipulate nibbles even when they are stored in bytes. For example, to change `byte x = 0xc7;` to `c3`, you can do `x = (x & 0xf0) | 0x3;` or `x &= 0xf0; x |= 0x3;` — Mitch, Jan 17 '21 at 03:57
That's correct; however, I want to reduce those bit manipulations and directly have a 4 bit array. — Dave Henry, Jan 17 '21 at 04:01
Just encapsulate a byte array! You are going to waste at most 4 bits of space, and that's not much. — Sweeper, Jan 17 '21 at 04:02
Yeah, but why? What are you trying to do? (I’m going for an X-Y problem) — Mitch, Jan 17 '21 at 04:02
Because the system from which I am receiving data as a stream outputs 4 bits at a time. — Dave Henry, Jan 17 '21 at 04:05
Byte array will be stored on the heap, that's 8 bytes for the pointer, 8 bytes for the header, 2 bytes for data, 2 (maybe 4) bytes padding. I count at least 19 extra bytes. Bit shift on a modern processor is less than a single clock cycle. I would encapsulate the high and low as properties — Charlieface, Jan 17 '21 at 04:10
Ok. If this is a matter of reading, then a stream abstraction which converts from four-bit packed data would probably be best. Do the conversion there, so the calling code does not need to do bit math. — Mitch, Jan 17 '21 at 04:12
@charlieface, agreed. Especially if the JIT generates nibble sized register accesses. Then mask+shift is completely “free” from a cycle and code size perspective. On the other hand, unless profiling shows this to be a hot path, I am not going to worry about it – whatever route makes bugs less likely is probably the best path. — Mitch, Jan 17 '21 at 04:17
Storing in an array would be better because I want to take 4*3 = 12 bits at a time.. — Dave Henry, Jan 17 '21 at 04:21
Then you need two bytes and some bit twiddling; an array is just stupid. @Mitch I don't believe there is direct nibble access in x86/x64. But the processor is not stupid and the pipeline will sort out the shift into a direct access I imagine. Either way, this is **at the very most** a single clock cycle extra, whereas a byte array on the heap, I'll leave to your imagination: GC ain't cheap, is not deterministic and can stop the world in many circumstances. — Charlieface, Jan 17 '21 at 04:22
The smallest unit on a normal PC is a byte. You can't do anything atomically/automatically with sizes smaller than that. In short, you can't have a type smaller than byte. Even bool must be a byte long — phuclv, Jan 17 '21 at 04:34
Then use an array of `Int16` and waste four bits. Use the smallest data size larger than you are trying to consume. If it were 24 bits, use 32, etc... — Mitch, Jan 17 '21 at 04:36
@Charlieface, good point: re no-nibble register access in x86 & AMD64. I don't know what I was thinking, but hopefully that is the one time this year I need to pull up the AMD64 Programmer's manual :) — Mitch, Jan 17 '21 at 05:25
You can make all manner of wonderful efficient structures to solve your problem, depending on the nature of the problem and the use cases. However, there is not much to solve at the moment: the scope of what you want to do is somewhat limited and lacking. — TheGeneral, Jan 17 '21 at 05:37
[Create a 4-bit type called Nybble : Variable Definition « Language Basics « C#](http://www.java2s.com/Code/CSharp/Language-Basics/Createa4bittypecalledNybble.htm) — , Jan 17 '21 at 06:02
Does this answer your question? [C# 4 bit data type](https://stackoverflow.com/questions/42075537/c-sharp-4-bit-data-type) — , Jan 17 '21 at 06:03

Mitch · Answer 1 · 2021-01-17T05:11:07.887

You can encapsulate a stream returning 4-bit samples by reading then converting (written from a phone without a compiler to test. Expect typos and off-by-one errors):

public static int ReadNibbles(this Stream s, byte[] data, int offset, int count)
{
    if (s == null)
    {
        throw new ArgumentNullException(nameof(s));
    }
    if (data == null)
    {
        throw new ArgumentNullException(nameof(data));
    }
    if (offset < 0 || count < 0 || data.Length < offset + count)
    {
        throw new ArgumentOutOfRangeException(nameof(count));
    }

    // Read the packed bytes into the front of the destination range.
    var readBytes = s.Read(data, offset, count / 2);
    // Expand in place from the end, so no packed byte is overwritten before it is unpacked.
    for (int n = readBytes * 2 - 1, k = readBytes - 1; k >= 0; k--)
    {
        byte packed = data[offset + k];
        data[offset + n--] = (byte)(packed & 0xf); // low nibble
        data[offset + n--] = (byte)(packed >> 4);  // high nibble
    }
    return readBytes * 2;
}
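If the data is already in a byte array rather than a stream, the same unpacking can be sketched without the in-place trick. This is a minimal illustration, not tested against your device: `NibbleUnpacker` and `ToNibbles` are names made up for this example, and it assumes the high nibble of each byte comes first.

```csharp
using System;

public static class NibbleUnpacker
{
    // Splits each packed byte into two 4-bit values, high nibble first (assumed order).
    // Each nibble is stored in the low 4 bits of its own output byte.
    public static byte[] ToNibbles(byte[] packed)
    {
        if (packed == null)
        {
            throw new ArgumentNullException(nameof(packed));
        }

        var nibbles = new byte[packed.Length * 2];
        for (int i = 0; i < packed.Length; i++)
        {
            nibbles[2 * i] = (byte)(packed[i] >> 4);       // high nibble
            nibbles[2 * i + 1] = (byte)(packed[i] & 0x0F); // low nibble
        }
        return nibbles;
    }
}
```

For the two bytes in the question, `NibbleUnpacker.ToNibbles(new byte[] { 0x0F, 0x50 })` gives the four values 0x0, 0xF, 0x5, 0x0. Note that this trades memory for convenience: every nibble occupies a full byte.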

To do the same for 12-bit integers (assuming MSB nibble ordering):

public static int Read(this Stream stream, ushort[] data, int offset, int length)
{
    if (stream == null)
    {
        throw new ArgumentNullException(nameof(stream));
    }
    if (data == null)
    {
        throw new ArgumentNullException(nameof(data));
    }
    if (data.Length < offset + length)
    {
        throw new ArgumentOutOfRangeException(nameof(length));
    }
    if (length < 2)
    {
        throw new ArgumentOutOfRangeException(nameof(length), "Cannot read fewer than two samples at a time");
    }
        
    // we always need a multiple of two
    length -= length % 2;

    // 3 bytes     length samples
    // --------- * -------------- = N bytes
    // 2 samples         1
    int rawCount = (length / 2) * 3;

    // This will place GC load.  Something like a buffer pool or keeping
    // the allocation as a field on the reader would be a good idea.
    var rawData = new byte[rawCount];
    int readBytes = 0;
    // the underlying stream may return fewer bytes than requested, so keep reading
    while (readBytes < rawCount)
    {
        int c = stream.Read(rawData, readBytes, rawCount - readBytes);
        if (c <= 0)
        {
            // End of stream
            break;
        }
        readBytes += c;
    }

    // unpack
    int k = 0;
    for (int i = 0; i < readBytes; i += 3)
    {
        // EOF in second byte is undefined
        if (i + 1 >= readBytes)
        {
            throw new InvalidOperationException("Unexpected EOF");
        }

        data[(k++) + offset] = (ushort)((rawData[i + 0] << 4) | (rawData[i + 1] >> 4));

        // EOF in third byte == only one sample
        if (i + 2 < readBytes)
        {
            data[(k++) + offset] = (ushort)(((rawData[i + 1] & 0xf) << 8) | rawData[i + 2]);
        }
    }
    return k;
}
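The packing arithmetic is easier to check in isolation. The sketch below pulls just the unpack step out into a hypothetical helper (`TwelveBitUnpacker.Unpack` is a name invented for this example) and assumes, like the method above, that the most significant nibble comes first and that the input holds whole 3-byte groups.

```csharp
using System;

public static class TwelveBitUnpacker
{
    // Unpacks each group of 3 bytes into two 12-bit samples (MSB nibble first, assumed).
    public static ushort[] Unpack(byte[] packed)
    {
        if (packed == null)
        {
            throw new ArgumentNullException(nameof(packed));
        }
        if (packed.Length % 3 != 0)
        {
            throw new ArgumentException("Length must be a multiple of 3.", nameof(packed));
        }

        var samples = new ushort[packed.Length / 3 * 2];
        for (int i = 0, k = 0; i < packed.Length; i += 3)
        {
            // First sample: all of byte 0 plus the high nibble of byte 1.
            samples[k++] = (ushort)((packed[i] << 4) | (packed[i + 1] >> 4));
            // Second sample: the low nibble of byte 1 plus all of byte 2.
            samples[k++] = (ushort)(((packed[i + 1] & 0x0F) << 8) | packed[i + 2]);
        }
        return samples;
    }
}
```

With the bytes 0xAB, 0xCD, 0xEF this yields the two samples 0xABC and 0xDEF, which is a quick way to confirm the nibble order matches what your source actually sends.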

score 0 · Answer 2 · answered Jan 19 '21 at 03:25

The best way to do this would be to look at the source for one of the existing integral data types. For example Int16.

If you look at that type, you can see that it implements a handful of interfaces:

[Serializable]
public struct Int16 : IComparable, IFormattable, IConvertible, IComparable<short>, IEquatable<short> { /* ... */ }

The implementation of the type isn't very complicated. It has a MaxValue, a MinValue, a couple of CompareTo overloads, a couple of Equals overloads, the System.Object overrides (GetHashCode, GetType, ToString (plus some overloads)), a handful of Parse and TryParse overloads and a range of IConvertible implementations.

In other places, you can find things like arithmetic, comparison and conversion operators.

BUT:

What System.Int16 has that you can't have is this:

internal short m_value;

That's a native type (16-bit integer) member that holds the value. There is no 4-bit native type. The best you are going to be able to do is have a native byte in your implementation that will hold the value. You can write accessors that constrain it to the lower 4 bits, but you can't do much more than that. If someone creates a Nibble array, it will be implemented as an array of those values. As far as I know, there's no way to inject your implementation into that array. Similarly, if someone creates some other collection (e.g., List<Nibble>), then the collection will be of instances of your type, each of which will take up 8 bits.

However

You can create specialized collection classes, NibbleArray, NibbleList, etc. C#'s syntax allows you to provide your own collection initialization implementation for a collection, your own indexing method, etc.

So, if someone does something like this:

var nyblArray = new NibbleArray(32);
nyblArray[4] = 0xd;

Then your code can, under the covers, create a 16-element byte array and set the low nibble of the third byte to 0xd.

Similarly, you can implement code to allow:

var newArray = new NibbleArray { 0x1, 0x3, 0x5, 0x7, 0x9, 0xa};

or

var nyblList = new NibbleList { 0x0, 0x2, 0xe};

A normal array will waste space, but your specialized collection classes will do what you are talking about (at the expense of some bit twiddling).
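To make that concrete, here is a rough sketch of what such a class could look like. Everything in it is an assumption for illustration: the `Count` and `ByteCount` members, the choice to back it with a `List<byte>` so that `Add` (and therefore the collection initializer syntax) works, and the layout in which even indexes map to the low nibble of a byte and odd indexes to the high nibble.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;

public sealed class NibbleArray : IEnumerable<byte>
{
    // Two nibbles per byte: even index = low nibble, odd index = high nibble (assumed layout).
    private readonly List<byte> _bytes = new List<byte>();

    public int Count { get; private set; }
    public int ByteCount => _bytes.Count;

    public NibbleArray() { }

    public NibbleArray(int count)
    {
        if (count < 0) throw new ArgumentOutOfRangeException(nameof(count));
        Count = count;
        _bytes.AddRange(new byte[(count + 1) / 2]);
    }

    public byte this[int index]
    {
        get
        {
            if (index < 0 || index >= Count) throw new ArgumentOutOfRangeException(nameof(index));
            byte b = _bytes[index / 2];
            return (byte)(index % 2 == 0 ? b & 0x0F : b >> 4);
        }
        set
        {
            if (index < 0 || index >= Count) throw new ArgumentOutOfRangeException(nameof(index));
            if (value > 0x0F) throw new ArgumentOutOfRangeException(nameof(value));
            byte b = _bytes[index / 2];
            _bytes[index / 2] = index % 2 == 0
                ? (byte)((b & 0xF0) | value)         // replace low nibble
                : (byte)((b & 0x0F) | (value << 4)); // replace high nibble
        }
    }

    // Required for the collection initializer syntax.
    public void Add(byte value)
    {
        if (value > 0x0F) throw new ArgumentOutOfRangeException(nameof(value));
        if (Count % 2 == 0) _bytes.Add(0); // start a new backing byte
        Count++;
        this[Count - 1] = value;
    }

    public IEnumerator<byte> GetEnumerator()
    {
        for (int i = 0; i < Count; i++) yield return this[i];
    }

    IEnumerator IEnumerable.GetEnumerator() => GetEnumerator();
}
```

With this in place, both `new NibbleArray(32)` followed by `nyblArray[4] = 0xd` and `new NibbleArray { 0x1, 0x3, 0x5 }` compile, and 32 nibbles occupy 16 backing bytes. The bit manipulation has not gone away; it is just confined to the indexer.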

score -1 · Answer 3 · answered Jan 17 '21 at 04:34

The closest you can get to what you want is to use an indexer:

// Indexer declaration
public int this[int index]
{
    // get and set accessors
}

Within the body of the indexer you can translate the index to the actual byte that contains your 4 bits.

The next thing you can do is operator overloading. You can redefine +, -, *...
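As a rough illustration of the operator overloading part, a value type along these lines is one possibility. The `Nibble` name, the conversions, and in particular the decision that arithmetic wraps around modulo 16 are assumptions made for this sketch; you may prefer to throw on overflow instead. The value is still held in a full byte internally.

```csharp
using System;

public readonly struct Nibble : IEquatable<Nibble>
{
    private readonly byte _value; // only the low 4 bits are ever used

    public Nibble(int value)
    {
        if (value < 0 || value > 0x0F) throw new ArgumentOutOfRangeException(nameof(value));
        _value = (byte)value;
    }

    // Narrowing from int can fail, so it is explicit; widening to byte is always safe.
    public static explicit operator Nibble(int value) => new Nibble(value);
    public static implicit operator byte(Nibble n) => n._value;

    // Arithmetic wraps modulo 16 (an assumed design choice).
    public static Nibble operator +(Nibble a, Nibble b) => new Nibble((a._value + b._value) & 0x0F);
    public static Nibble operator -(Nibble a, Nibble b) => new Nibble((a._value - b._value) & 0x0F);

    public static bool operator ==(Nibble a, Nibble b) => a._value == b._value;
    public static bool operator !=(Nibble a, Nibble b) => a._value != b._value;

    public bool Equals(Nibble other) => _value == other._value;
    public override bool Equals(object obj) => obj is Nibble other && Equals(other);
    public override int GetHashCode() => _value;
    public override string ToString() => _value.ToString("X1");
}
```

For example, `(Nibble)0xF + (Nibble)0x3` evaluates to a `Nibble` holding 0x2 under the wrap-around rule chosen here.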
