how to encode 27 vector3's into a 0-256 value?

Question

I have 27 combinations of 3 values from -1 to 1 of type:

 Vector3(0,0,0);
 Vector3(-1,0,0);
 Vector3(0,-1,0);
 Vector3(0,0,-1);
 Vector3(-1,-1,0);
 ... up to
 Vector3(0,1,1);
 Vector3(1,1,1);

I need to convert them to and from a 8-bit sbyte / byte array.

One solution is to say the first digit, of the 256 = X the second digit is Y and the third is Z...

so

  Vector3(-1,1,1) becomes 022,
  Vector3(1,-1,-1) becomes 200,
  Vector3(1,0,1) becomes 212...

I'd prefer to encode it in a more compact way, perhaps using bytes (which I am clueless about), because the above solution uses a lot of multiplications and round functions to decode, do you have some suggestions please? the other option is to write 27 if conditions to write the Vector3 combination to an array, it seems inefficient.

Thanks to Evil Tak for the guidance, i changed the code a bit to add 0-1 values to the first bit, and to adapt it for unity3d:

function Pack4(x:int,y:int,z:int,w:int):sbyte {
var b: sbyte = 0;

b |= (x + 1) << 6; 
b |= (y + 1) << 4;
b |= (z + 1) << 2;
b |= (w + 1);   
return b;
}

function unPack4(b:sbyte):Vector4 {
var v : Vector4; 
v.x = ((b & 0xC0) >> 6) - 1;      //0xC0 == 1100 0000   
v.y = ((b & 0x30) >> 4) - 1;    // 0x30 == 0011 0000
v.z = ((b & 0xC) >> 2) - 1;     // 0xC  == 0000 1100
v.w = (b  & 0x3) - 1;            // 0x3  == 0000 0011
return v;
}

Spektre · Answer 1 · 2017-04-03T08:58:48.530

I assume your values are float not integer

so bit operations will not improve speed too much in comparison to conversion to integer type. So my bet using full range will be better. I would do this for 3D case:
```
8 bit -> 256 values
3D -> pow(256,1/3) = ~ 6.349 values per dimension
6^3 = 216 < 256
```
So packing of (x,y,z) looks like this:
```
BYTE p;
p =floor((x+1.0)*3.0);
p+=floor((y+1.0)*3.0*6.0);
p+=floor((y+1.0)*3.0*6.0*6.0);
```
The idea is convert <-1,+1> to range <0,1> hence the +1.0 and *3.0 instead of *6.0 and then just multiply to the correct place in final BYTE.

and unpacking of p looks like this:
```
x=p%6; x=(x/3.0)-1.0; p/=6;
y=p%6; y=(y/3.0)-1.0; p/=6;
z=p%6; z=(z/3.0)-1.0;
```
This way you use 216 from 256 values which is much better then just 2 bits (4 values). Your 4D case would look similar just use instead 3.0,6.0 different constant floor(pow(256,1/4))=4 so use 2.0,4.0 but beware case when p=256 or use 2 bits per dimension and bit approach like the accepted answer does.

If you need real speed you can optimize this to force float representation holding result of packet BYTE to specific exponent and extract mantissa bits as your packed BYTE directly. As the result will be <0,216> you can add any bigger number to it. see IEEE 754-1985 for details but you want the mantissa to align with your BYTE so if you add to p number like 2^23 then the lowest 8 bit of float should be your packed value directly (as MSB 1 is not present in mantissa) so no expensive conversion is needed.
In case you got just {-1,0,+1} instead of <-1,+1>

then of coarse you should use integer approach like bit packing with 2 bits per dimension or use LUT table of all 3^3 = 27 possibilities and pack entire vector in 5 bits.

The encoding would look like this:
```
int enc[3][3][3] = { 0,1,2, ... 24,25,26 };
p=enc[x+1][y+1][z+1];
```
And decoding:
```
int dec[27][3] = { {-1,-1,-1},.....,{+1,+1,+1} };
x=dec[p][0];
y=dec[p][1];
z=dec[p][2];
```
Which should be fast enough and if you got many vectors you can pack the p into each 5 bits ... to save even more memory space

Hi, that's a very interesting programming trick. Thanks a lot. The original question would be very complicated to say clearly, it involves a voxel space where the normals are derived from the free spaces around the voxels, and because there are so many samples, the program only needs 0/1 2-bit normals to say the where the cubes/samples are facing, which are added together by poisson disk vertex mapping afterwards. I have to upscale to 8-bit precision afterwards, because i can antialias the voxels by keeping only the jagged corner ones and averaging the samples around them from flat zones. — bandybabboon, Apr 03 '17 at 09:14
@comprehensible in that context BYTE representation make much more sense as you can use it for masked smoothing and stuff needed for not just anti-aliasing. I used it for this [How to procedurlly generate a Zelda like make in java](http://stackoverflow.com/a/36263999/2521214) for smoothing of tiled Voxel 3D terrain by masking the combinations of neighboring voxels. basically I convert 27 voxels cube into string and mask string compare with rules for task I perform like detect specific corner to place specific tile etc ... it really ease up the code so instead hundreds of `if else` i have few. — Spektre, Apr 03 '17 at 09:29

EvilTak · Accepted Answer · 2017-04-03T02:44:54.017

1

One way is to store the component of each vector in every 2 bits of a byte.

Converting a vector component value to and from the 2 bit stored form is as simple as adding and subtracting one, respectively.

-1 (1111 1111 as a signed byte) <-> 00 (in binary)
 0 (0000 0000 in binary)        <-> 01 (in binary)
 1 (0000 0001 in binary)        <-> 10 (in binary)

The packed 2 bit values can be stored in a byte in any order of your preference. I will use the following format: 00XXYYZZ where XX is the converted (packed) value of the X component, and so on. The 0s at the start aren't going to be used.

A vector will then be packed in a byte as follows:

byte Pack(Vector3<int> vector) {
    byte b = 0;
    b |= (vector.x + 1) << 4; 
    b |= (vector.y + 1) << 2;
    b |= (vector.z + 1);
    return b;
}

Unpacking a vector from its byte form will be as follows:

Vector3<int> Unpack(byte b) {
    Vector3<int> v = new Vector<int>();
    v.x = ((b & 0x30) >> 4) - 1;    // 0x30 == 0011 0000
    v.y = ((b & 0xC) >> 2) - 1;     // 0xC == 0000 1100
    v.z = (b & 0x3) - 1;     // 0x3 == 0000 0011
    return v;
}

Both the above methods assume that the input is valid, i.e. All components of vector in Pack are either -1, 0 or 1 and that all two-bit sections of b in Unpack have a (binary) value of either 00, 01 or 10.

Since this method uses bitwise operators, it is fast and efficient. If you wish to compress the data further, you could try using the 2 unused bits too, and convert every 3 two-bit elements processed to a vector.

edited Apr 03 '17 at 02:44

answered Apr 02 '17 at 14:17

EvilTak

7,091
27
36

very cool, Thank you, it is very difficult without sound knowledge of the topic. I changed the code a bit for vector4 with xyzw. – bandybabboon Apr 02 '17 at 18:02
ok i had an error to convert the vec4 version and i fixed it, cheers. – bandybabboon Apr 02 '17 at 19:12
@comprehensible: slightly faster with `(v.x << 4) + (v.y << 2) + v.z + 21`. – Apr 03 '17 at 09:57
I am programming in .net and the memory access happens in 8-bit segments. which function like syllables of memory, so it may perhaps be faster than higher compression which may call for two memory segments for one data sometimes. It gives a variance to the number of memory calls required, which may be 50% higher if half of the values bridge two memory segments. I have 32 gigs of memory. the +21 is that for encoding? i don't see a bit mask. awesome. – bandybabboon Apr 03 '17 at 10:11
@YvesDaoust that's not _necessarily_ faster, considering the number of carry operations required to add 21 instead of 1 3 times. – EvilTak Apr 04 '17 at 00:55
@YvesDaoust since the asker didn't mention it, I didn't consider memory efficiency as a requirement, since using those two bits would take a little more effort which the asker could put in himself. If s/he would have mentioned it, I would have considered putting it in the answer itself. – EvilTak Apr 04 '17 at 00:57

score 1 · Answer 3 · 2017-04-02T19:35:56.570

1

The most compact way is by writing a 27 digits number in base 3 (using a shift -1 -> 0, 0 -> 1, 1 -> 2).

The value of this number will range from 0 to 3^27-1 = 7625597484987, which takes 43 bits to be encoded, i.e. 6 bytes (and 5 spare bits).

This is a little saving compared to a packed representation with 4 two-bit numbers packed in a byte (hence 7 bytes/56 bits in total).

An interesting variant is to group the base 3 digits five by five in bytes (hence numbers 0 to 242). You will still require 6 bytes (and no spare bits), but the decoding of the bytes can easily be hard-coded as a table of 243 entries.

edited Apr 02 '17 at 19:35

answered Apr 02 '17 at 19:24

As i have to process 1/2 billion voxels, the 8 bit array taks about 2 gigs, whereas a vector3 int takes 8 gigs, the biggest concern is read and write speed. It is fairly fast to read through 1 billion 8 bit array on an i7, 1-2 seconds, and then doing 2-3 maths / conversion instructions for every number takes 2-3 times longer. so, every time 1 instruction is added to a 1bn data read op it takes about 1-2 seconds extra. I would use your suggestion to write arrays to disk, it does sound quite difficile hehe. thanks, base 3 conversion is cool. – bandybabboon Apr 02 '17 at 20:48
@comprehensible: "it does sound quite difficile": what ? – Apr 03 '17 at 09:35
I wasn't very well versed on base 3... I researched it and apparently it is equanimous to base 10 using 3x3x3. difficile.. my french is panoramic. i speak it. – bandybabboon Apr 03 '17 at 09:50
@comprehensible: "it is equanimous to base 10 using 3x3x3": no. Conversion from base 3 is very simple (it's a polynomial evaluation); conversion to base 3 involves divisions and would be disrecommended here, but as I suggested, you can tabulate it, giving excellent performance. – Apr 03 '17 at 09:55

how to encode 27 vector3's into a 0-256 value?

3 Answers3