I have an enum which represents every possible instruction on an 8080 processor. An instruction can be 1, 2 or 3 bytes long, depending on how much data is associated with it. For example:

#[allow(non_camel_case_types)]
enum Instruction {
    None,
    NOP,
    LXI_B_D16(u8, u8),
    STAX_B,
    INX_B,
    MVI_B_D8(u8),
    MVI_C_D8(u8),
    RRC,
    LXI_D_D16(u8, u8),
    MVI_D_D8(u8),
    RAL,
    DCR_E,
    MVI_E_D8(u8),
    LXI_H_D16(u8, u8),
    SHLD(u16),
    LHLD(u16),
    // ...
}

When it comes to assigning memory addresses to instructions, I iterate instruction-by-instruction through the binary file, using the length of each instruction to make sure my loop doesn't land halfway through an instruction and give me garbage. I do this with a huge match expression, which returns a tuple containing the correct instruction and its length:

match value {
    0x00 => (Instruction::NOP, 1),
    0x01 => (Instruction::LXI_B_D16(d1, d2), 3),
    0x02 => (Instruction::STAX_B, 1),
    0x05 => (Instruction::DCR_B, 1),
    0x06 => (Instruction::MVI_B_D8(d1), 2),
    0x07 => (Instruction::RLC, 1),
    0x0e => (Instruction::MVI_C_D8(d1), 2),
    0x0f => (Instruction::RRC, 1),
    0x11 => (Instruction::LXI_D_D16(d1, d2), 3),
    0x19 => (Instruction::DAD_D, 1),
    // ...
}

This is ugly, but I don't want to build the length into the type, because it's really only important when I'm parsing the file.

It seems like I should be able to just infer the length of an instruction from the shape of the variant. Anything with no argument is length 1, anything with one u8 argument is length 2, and anything with one u16 or two u8 arguments is length 3.
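
Spelled out by hand for a cut-down enum, that rule would look something like the sketch below. The `Op` enum and its `len` method are hypothetical names used only for illustration, not part of the real code. Writing this out for every 8080 opcode is exactly the repetition the question is trying to avoid.

```rust
// Hypothetical cut-down enum; the real one has a variant per 8080 opcode.
#[allow(non_camel_case_types, dead_code)]
enum Op {
    NOP,
    MVI_B_D8(u8),
    LXI_B_D16(u8, u8),
    SHLD(u16),
}

#[allow(dead_code)]
impl Op {
    // Length in bytes: one opcode byte plus the operand bytes.
    fn len(&self) -> usize {
        match self {
            // No operand: just the opcode byte.
            Op::NOP => 1,
            // One u8 operand.
            Op::MVI_B_D8(_) => 2,
            // Two u8 operands or one u16 operand.
            Op::LXI_B_D16(_, _) | Op::SHLD(_) => 3,
        }
    }
}
```

With this in place, `Op::SHLD(0x1234).len()` and `Op::LXI_B_D16(0x12, 0x34).len()` both evaluate to 3.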

I haven't been able to work out how to get this shape programmatically. I can't call len() on it like an array or vector, for example.

I don't think this is a duplicate of "How to get the number of elements in an enum as a constant value?", as I'm not looking for the number of variants in the enum, but for the number of arguments of any individual variant.

Shepmaster
rivas
  • I don't think it's possible, and I also would like to know what you are doing with this information. You should not need this. – Stargateur Jan 28 '19 at 13:01
  • It seems that the underlying question is more about removing repetition. One way to do that in Rust is with macros. You could use a macro to generate the instructions enum at the same time as that big match expression. – Peter Hall Jan 28 '19 at 13:25
  • @stargateur An 8080 ROM (in my case, Space Invaders) is read into my program as a single array of u8s. Each instruction is either 1, 2 or 3 u8s long: The first u8 represents the instruction type, and the remaining bytes are the associated data. For example, a JMP command has an associated u16 memory address to mark the jump location. As I parse this array, I iterate by instruction byte, so I need to know the length of the instruction to know by how much to increment the pointer. This might be a very naive way of doing this, I'm not sure! – rivas Jan 28 '19 at 15:52

1 Answer

As mentioned in the comments, you can write a macro to write the code for you.

In the interest of laziness, I've simplified the enum definition to always require parentheses. This isn't strictly necessary; it just made my job easier.

Once the macro has parsed the enum, we can generate an impl block with a method that matches on every variant. We pass each variant's field types to an inner macro rule that performs the count for us. The method returns the number of fields in each variant:

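macro_rules! instructions {
    // Entry point: an enum whose variants all use tuple syntax, e.g. `Nop()`.
    (enum $ename:ident {
        $($vname:ident ( $($vty:ty),* )),*
    }) => {
        // Re-emit the enum definition unchanged.
        enum $ename {
            $($vname ( $($vty),* )),*
        }

        impl $ename {
            // Number of fields carried by this variant.
            fn len(&self) -> usize {
                match self {
                    $($ename::$vname(..) => instructions!(@count ($($vty),*))),*
                }
            }
        }
    };

    // Internal rules: count a list of zero to three types.
    (@count ()) => (0);
    (@count ($a:ty)) => (1);
    (@count ($a:ty, $b:ty)) => (2);
    (@count ($a:ty, $b:ty, $c:ty)) => (3);
}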

instructions! {
    enum Instruction {
        None(),
        One(u8),
        Two(u8, u8),
        Three(u8, u8, u8)
    }
}

fn main() {
    println!("{}", Instruction::None().len()); // 0
    println!("{}", Instruction::One(1).len()); // 1
    println!("{}", Instruction::Two(1, 2).len()); // 2
    println!("{}", Instruction::Three(1, 2, 3).len()); // 3
}
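
Note that `len` counts fields, so a `u16` operand counts as one and the opcode byte is not included. For the parsing use case in the question, the encoded length in bytes may be more useful. The following is a minimal sketch of one way to get it, assuming every field is a plain integer type so that `std::mem::size_of` matches its encoded size. The macro name `byte_instructions!`, the method name `byte_len` and the variant names are hypothetical and chosen only for this example.

```rust
// Sketch: same macro shape as before, but the generated method returns the
// encoded length in bytes (one opcode byte plus the operand bytes).
macro_rules! byte_instructions {
    (enum $ename:ident {
        $($vname:ident ( $($vty:ty),* )),* $(,)?
    }) => {
        #[allow(dead_code)]
        enum $ename {
            $($vname ( $($vty),* )),*
        }

        #[allow(dead_code)]
        impl $ename {
            // Hypothetical helper: 1 for the opcode, plus the size of each field.
            fn byte_len(&self) -> usize {
                match self {
                    $($ename::$vname(..) => 1 $(+ std::mem::size_of::<$vty>())*),*
                }
            }
        }
    };
}

byte_instructions! {
    enum Instruction {
        Nop(),
        MviB(u8),
        LxiB(u8, u8),
        Shld(u16),
    }
}
```

Here `Instruction::Shld(0x1234).byte_len()` and `Instruction::LxiB(0x12, 0x34).byte_len()` both evaluate to 3, which matches the rule described in the question. Summing `size_of` also removes the need for the fixed-arity `@count` rules, at the cost of assuming that in-memory size equals encoded size.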

You could also write a custom derive macro that provides the same functionality.

Shepmaster