I am looking for the most efficient way to store array of small numbers 0-31 (exactly 5 bits each) and decode them back in C#. Should I use array of bytes or some bitarray? How to encode and decode it with minimum overhead? Input can be List of bytes where each byte has maximum value 31 (5 bits used only), so 8 numbers should be encoded to 5 bytes.
Asked
Active
Viewed 89 times
0
-
Have you seen [this](https://stackoverflow.com/a/55834344/1911064) related post? – Axel Kemper Aug 27 '22 at 15:51
-
I don't think it's the same. It sets uint into uint. I need to encode 5-bits numbers to array of bytes, so I need to split some numbers to store them to different bytes. – Jiri Hudecek Aug 27 '22 at 17:35
2 Answers
0
You could use something like this:
class SmallIntArray
{
const uint BitsPerInt = 5;
private BitArray[] b;
public SmallIntArray(int len)
{
b = new BitArray[BitsPerInt];
for (int pos = 0; pos < BitsPerInt; pos++)
{
b[pos] = new BitArray(len);
}
}
public byte[] Export()
// create a byte array to store away
{
BinaryFormatter formatter = new BinaryFormatter();
using (var ms = new MemoryStream())
{
formatter.Serialize(ms, b);
return ms.ToArray();
}
}
public void Import(byte[] bytes)
// read byte array into object
{
using (var ms = new MemoryStream())
{
var formatter = new BinaryFormatter();
ms.Write(bytes, 0, bytes.Length);
ms.Seek(0, SeekOrigin.Begin);
b = (BitArray[])formatter.Deserialize(ms);
}
}
public int this[int idx]
{
get
{
int res = 0;
checkIndex(idx);
for (int pos = 0; pos < BitsPerInt; pos++)
{
res = (res << 1) + (b[pos].Get(idx) ? 1 : 0);
}
return res;
}
set
{
int i = value;
checkIndex(idx);
for (int pos = 0; pos < BitsPerInt; pos++)
{
b[pos].Set(idx, (i & 1) == 1);
i >>= 1;
}
// catch arguments with too many bits
Debug.Assert(i == 0);
}
}
private void checkIndex(int idx)
{
Debug.Assert((idx >= 0) && (idx < b[0].Count));
}
}
Example usage:
var sia = new SmallIntArray((int)10e6);
sia[1234] = 17;
var x = sia.Export();
var sia2 = new SmallIntArray(0);
sia2.Import(x);
Console.WriteLine($"{sia2[1234]}");
The measured memory consumption is near 5 bits per number.

Axel Kemper
- 10,544
- 2
- 31
- 54
-
I am not sure if this is suitable for my usecase. Let's say that I have byte[800] where every number takes only 5 bits. I would like to shrink it to byte[500] and save it to database as a binary object. – Jiri Hudecek Aug 27 '22 at 18:24
-
I added export/import functions which can be used to store the data in a database or re-use retrieved data from the database. – Axel Kemper Aug 28 '22 at 20:24
0
You could use a fairly simple unrolled loop, along with some bit-twiddling, to convert between these two formats.
I've assumed the arrays are exactly divisible by 8 and 5, otherwise you will need to devise a method to store the length.
public byte[] ConvertTo5Bit(byte[] array)
{
Debug.Assert(array.Length % 8 == 0);
var length = array.Length / 8 * 5;
var newArray = new byte[length];
for (var i = 0; i < array.Length; i += 8)
{
newArray[i] = array[i] | (array[i + 1] << 5);
newArray[i + 1] = (array[i + 1] >> 3) | (array[i + 2] << 2) | (array[i + 3] << 7);
newArray[i + 2] = (array[i + 3] >> 1) | (array[i + 4] << 4);
newArray[i + 3] = (array[i + 4] >> 4) | (array[i + 5] << 1) | (array[i + 6] << 6;
newArray[i + 4] = (array[i + 6] >> 2) | (array[i + 7] << 3);
}
return newArray;
}
public byte[] ConvertFrom5Bit(byte[] array)
{
Debug.Assert(array.Length % 5 == 0);
var length = array.Length / 5 * 8;
var newArray = new byte[length];
for (var i = 0; i < array.Length; i += 5)
{
newArray[i] = array[i] & 0x1F;
newArray[i + 1] = (array[i] >> 5) | (array[i + 1] << 3) & 0x1F
newArray[i + 2] = (array[i + 1] >> 2) & 0x1F;
newArray[i + 3] = (array[i + 1] >> 7) | (array[i + 2] << 1) & 0x1F;
newArray[i + 4] = (array[i + 2] >> 4) | (array[i + 3] << 4) & 0x1F;
newArray[i + 5] = (array[i + 3] >> 1) & 0x1F;
newArray[i + 6] = (array[i + 3] >> 6) | (array[i + 4] << 2 & 0x1F;
newArray[i + 7] = array[i + 4] >> 3;
}
return newArray;
}

Charlieface
- 52,284
- 6
- 19
- 43