I'm doing some computational mathematics in Rust, and I have some large numbers that I store in an array of 24 values. I have functions that convert them to bytes and back, but the round trip fails for u32 values, whereas it works for u64. The code sample is below:

fn main() {
    let mut bytes = [0u8; 96]; // since u32 is 4 bytes in my system, 4*24 = 96
    let mut j;
    let mut k: u32;

    let mut num: [u32; 24] = [1335565270, 4203813549, 2020505583, 2839365494, 2315860270, 442833049, 1854500981, 2254414916, 4192631541, 2072826612, 1479410393, 718887683, 1421359821, 733943433, 4073545728, 4141847560, 1761299410, 3068851576, 1582484065, 1882676300, 1565750229, 4185060747, 1883946895, 4146];
    println!("original_num: {:?}", num);

    for i in 0..96 {
        j = i / 4;
        k = (i % 4) as u32;
        bytes[i as usize] = (num[j as usize] >> (4 * k)) as u8;
    }

    println!("num_to_bytes: {:?}", &bytes[..]);
    num = [0u32; 24];

    for i in 0..96 {
        j = i / 4;
        k = (i % 4) as u32;
        num[j as usize] |= (bytes[i as usize] as u32) << (4 * k);
    }

    println!("recovered_num: {:?}", num);
}

The above code does not recover the original numbers from the byte array. But if I change every u32 to u64, every 4 to 8, and reduce the size of num from 24 values to 12, it all works fine. I assume there is a logic error in the u32 version. The correctly working u64 version can be found in this Rust playground.

tinker

1 Answer

Learning how to create an MCVE (minimal, complete, and verifiable example) is a crucial skill when programming. For example, why do you have an array at all? Why do you reuse variables?

Your original first number is 0x4F9B1BD6, but the first number in the output is 0x000B1BD6.

Comparing the intermediate bytes shows that you have garbage:

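const BYTES_PER_U32: usize = 4;

let num = 0x4F9B1BD6_u32;
println!("{:08X}", num);

// Split the number into bytes using the question's (buggy) 4-bit shift.
let mut bytes = [0u8; BYTES_PER_U32];
for i in 0..bytes.len() {
    let k = (i % BYTES_PER_U32) as u32;
    bytes[i] = (num >> (4 * k)) as u8;
}

// Print each byte as two hex digits.
for b in &bytes {
    print!("{:02X}", b);
}
println!();

4F9B1BD6
D6BD1BB1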

Printing out the values of k:

for i in 0..bytes.len() {
    let k = (i % BYTES_PER_U32) as u32;
    println!("{} / {}", k, 4 * k);
    bytes[i] = (num >> (4 * k)) as u8;
}

Shows that you are trying to shift by multiples of 4 bits:

0 / 0
1 / 4
2 / 8
3 / 12

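I'm pretty sure that every common platform today uses 8 bits for a byte, not 4. This is also why your u64 version works: changing every 4 to 8 happened to turn the shift into the correct 8 bits per byte.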
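
To see the effect of the shift width in isolation, here is a minimal sketch that round-trips a single value with a configurable step. The helper name `round_trip` and its `shift_step` parameter are made up for illustration and are not part of the original code.

```rust
/// Splits `num` into 4 bytes, shifting by `shift_step` bits per byte,
/// then rebuilds a u32 from those bytes with the same shifts.
/// (`round_trip` is a hypothetical helper for this illustration.)
fn round_trip(num: u32, shift_step: u32) -> u32 {
    let mut bytes = [0u8; 4];
    for (i, b) in bytes.iter_mut().enumerate() {
        *b = (num >> (shift_step * i as u32)) as u8;
    }

    let mut out = 0u32;
    for (i, &b) in bytes.iter().enumerate() {
        out |= (b as u32) << (shift_step * i as u32);
    }
    out
}

fn main() {
    let num = 0x4F9B1BD6_u32;
    // 4-bit steps: neighbouring bytes overlap and the top 12 bits are never stored.
    println!("{:08X}", round_trip(num, 4)); // 000B1BD6
    // 8-bit steps: every bit lands in exactly one byte.
    println!("{:08X}", round_trip(num, 8)); // 4F9B1BD6
}
```

With a 4-bit step the four bytes only cover bits 0 through 19 of the value, which is why the top of the recovered number is zero.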

This is why magic numbers are bad. If you had used constants for the values, you would have noticed the problem much sooner.
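
As a rough sketch of what that could look like, here is the manual conversion with the literals replaced by named constants. The constant `BITS_PER_BYTE` and the function names `to_bytes` and `from_bytes` are my own choices for illustration; the byte order is the same least-significant-byte-first order as in the question.

```rust
const BITS_PER_BYTE: u32 = 8;
const BYTES_PER_U32: usize = 4;
const LENGTH: usize = 24;

/// Writes each u32 into 4 consecutive bytes, least significant byte first.
fn to_bytes(num: &[u32; LENGTH]) -> [u8; LENGTH * BYTES_PER_U32] {
    let mut bytes = [0u8; LENGTH * BYTES_PER_U32];
    for (i, b) in bytes.iter_mut().enumerate() {
        let word = i / BYTES_PER_U32;
        let k = (i % BYTES_PER_U32) as u32;
        *b = (num[word] >> (BITS_PER_BYTE * k)) as u8;
    }
    bytes
}

/// Rebuilds the u32 values from the byte layout produced by `to_bytes`.
fn from_bytes(bytes: &[u8; LENGTH * BYTES_PER_U32]) -> [u32; LENGTH] {
    let mut num = [0u32; LENGTH];
    for (i, &b) in bytes.iter().enumerate() {
        let word = i / BYTES_PER_U32;
        let k = (i % BYTES_PER_U32) as u32;
        num[word] |= (b as u32) << (BITS_PER_BYTE * k);
    }
    num
}

fn main() {
    // Only the first and last values from the question, to keep the sketch short.
    let mut num = [0u32; LENGTH];
    num[0] = 1335565270;
    num[LENGTH - 1] = 4146;

    let bytes = to_bytes(&num);
    let recovered = from_bytes(&bytes);
    assert_eq!(recovered, num);
    println!("recovered_num: {:?}", recovered);
}
```

Written this way, a shift of `4 * k` would have had to be spelled `BYTES_PER_U32 * k`, which reads as wrong immediately.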

> since u32 is 4 bytes in my system

A u32 had better be 4 bytes on every system; that's why it's called a u32.
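
If you would rather check than take my word for it, a small sketch using only the standard library follows. It assumes a toolchain recent enough to have the `u32::BITS` associated constant.

```rust
use std::mem::size_of;

fn main() {
    // The size of u32 is fixed by the language, not by the target platform.
    assert_eq!(size_of::<u32>(), 4);
    assert_eq!(u32::BITS, 32);
    println!("u32 is {} bytes, {} bits", size_of::<u32>(), u32::BITS);
}
```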


Overall, don't reinvent the wheel. Use the byteorder crate or equivalent:

extern crate byteorder;

use byteorder::{BigEndian, ReadBytesExt, WriteBytesExt};

const LENGTH: usize = 24;
const BYTES_PER_U32: usize = 4;

fn main() {
    let num: [u32; LENGTH] = [
        1335565270, 4203813549, 2020505583, 2839365494, 2315860270, 442833049, 1854500981,
        2254414916, 4192631541, 2072826612, 1479410393, 718887683, 1421359821, 733943433,
        4073545728, 4141847560, 1761299410, 3068851576, 1582484065, 1882676300, 1565750229,
        4185060747, 1883946895, 4146,
    ];
    println!("original_num: {:?}", num);

    let mut bytes = [0u8; LENGTH * BYTES_PER_U32];
    {
        let mut bytes = &mut bytes[..];
        for &n in &num {
            bytes.write_u32::<BigEndian>(n).unwrap();
        }
    }

    let mut num = [0u32; LENGTH];
    {
        let mut bytes = &bytes[..];
        for n in &mut num {
            *n = bytes.read_u32::<BigEndian>().unwrap();
        }
    }

    println!("recovered_num: {:?}", num);
}
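
As one possible "equivalent", the standard library's `u32::to_be_bytes` and `u32::from_be_bytes` can do the same job without an extra crate. This is a sketch under the assumption that your toolchain provides them (they did not exist yet in older Rust releases); the function names `to_bytes` and `from_bytes` are mine.

```rust
const LENGTH: usize = 24;
const BYTES_PER_U32: usize = 4;

/// Serializes every u32 as 4 big-endian bytes.
fn to_bytes(num: &[u32; LENGTH]) -> [u8; LENGTH * BYTES_PER_U32] {
    let mut bytes = [0u8; LENGTH * BYTES_PER_U32];
    for (chunk, n) in bytes.chunks_exact_mut(BYTES_PER_U32).zip(num.iter()) {
        chunk.copy_from_slice(&n.to_be_bytes());
    }
    bytes
}

/// Reads the u32 values back from the layout produced by `to_bytes`.
fn from_bytes(bytes: &[u8; LENGTH * BYTES_PER_U32]) -> [u32; LENGTH] {
    let mut num = [0u32; LENGTH];
    for (n, chunk) in num.iter_mut().zip(bytes.chunks_exact(BYTES_PER_U32)) {
        let mut buf = [0u8; BYTES_PER_U32];
        buf.copy_from_slice(chunk);
        *n = u32::from_be_bytes(buf);
    }
    num
}

fn main() {
    // Only the first and last values from the question, to keep the sketch short.
    let mut num = [0u32; LENGTH];
    num[0] = 1335565270;
    num[LENGTH - 1] = 4146;

    let recovered = from_bytes(&to_bytes(&num));
    assert_eq!(recovered, num);
    println!("recovered_num: {:?}", recovered);
}
```

Like the byteorder version above, this stores the most significant byte first, so the byte layout differs from the least-significant-first order of the hand-written loops in the question.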
Shepmaster
  • Thank you. I was indeed bitten by the magic numbers and didn't recognize the wrong shift amount. – tinker Jan 08 '18 at 22:57
  • Actually I am not sure that `u32` is necessarily 4 bytes on every system. In C, `uint32_t` is not necessarily available on all platforms, because some platforms don't have 8-bit bytes (i.e. `CHAR_BIT != 8`). – Matthieu M. Jan 09 '18 at 10:22
  • @MatthieuM. I've always been under the impression that Rust's `u32` is always exactly 32 bits. Even on Arduino (8-bit register size), `i128` exists as a grouping of smaller registers. – Shepmaster Jan 09 '18 at 15:30
  • @Shepmaster: It's easy to compose an integer out of 8-bit blocks, but when your blocks are 9 or 10 bits, then what do you do? From https://stackoverflow.com/questions/2098149/what-platforms-have-something-other-than-8-bit-char there appear to be 24-bit DSPs, for example; the same page also mentions that POSIX requires 8-bit bytes, so a lot of systems should be okay. – Matthieu M. Jan 09 '18 at 15:42
  • @MatthieuM. I understand that there are such platforms, and that *C* may or may not handle them, but does *Rust* actually attempt to work for any of these types of platforms? – Shepmaster Jan 09 '18 at 15:43
  • @Shepmaster: *Rust All The Platforms!*. – Matthieu M. Jan 09 '18 at 16:29