5

So I am working on a little NES emulator using Rust and I am trying to be fancy with my status register. The register is a struct that holds some fields (flags) that contain a bool, the register itself is part of a CPU struct. Now, I want to loop through these fields and set the bool values based on some instruction I execute. However, am not able to implement a mutable iterator, I've implemented an into_iter() function and are able to iterate through the fields to get/print a bool value but how do I mutate these values within the struct itself? Is this even possible?

pub struct StatusRegister {
    CarryFlag: bool,
    ZeroFlag: bool,
    OverflowFlag: bool,
}

impl StatusRegister {
    fn new() -> Self {
        StatusRegister {
            CarryFlag: true,
            ZeroFlag: false,
            OverflowFlag: true,
        }
    }
}

impl<'a> IntoIterator for &'a StatusRegister {
    type Item = bool;
    type IntoIter = StatusRegisterIterator<'a>;

    fn into_iter(self) -> Self::IntoIter {
        StatusRegisterIterator {
            status: self,
            index: 0,
        }
    }
}

pub struct StatusRegisterIterator<'a> {
    status: &'a StatusRegister,
    index: usize,
}

impl<'a> Iterator for StatusRegisterIterator<'a> {
    type Item = bool;

    fn next(&mut self) -> Option<bool> {
        let result = match self.index {
            0 => self.status.CarryFlag,
            1 => self.status.ZeroFlag,
            2 => self.status.OverflowFlag,
            _ => return None,
        };
        self.index += 1;
        Some(result)
    }
}

pub struct CPU {
    pub memory: [u8; 0xffff],
    pub status: StatusRegister,
}

impl CPU {
    pub fn new() -> CPU {
        let memory = [0; 0xFFFF];
        CPU {
            memory,
            status: StatusRegister::new(),
        }
    }

    fn execute(&mut self) {
        let mut shifter = 0b1000_0000;
        for status in self.status.into_iter() {
            //mute status here!
            println!("{}", status);
            shifter <<= 1;
        }
    }
}

fn main() {
    let mut cpu = CPU::new();
    cpu.execute();
}
Stargateur
  • 24,473
  • 8
  • 65
  • 91
Vinnie
  • 213
  • 2
  • 8
  • 1
    Does this answer your question? [How do I write an iterator that returns references to itself?](https://stackoverflow.com/questions/30422177/how-do-i-write-an-iterator-that-returns-references-to-itself) – Stargateur May 23 '20 at 23:23
  • what the duplicate don't say is that in your case you want `&mut` that a big difference with `&` because reference implement copy but mutable reference don't. And so compiler need to borrow the iterator for the same `'a` lifetime to ensure the mutable reference is only borrow once. But you can't do that because Rust don't have GATs. I think std use unsafe for their mutable iterator implementation. – Stargateur May 23 '20 at 23:33
  • 1
    I don't believe the duplicate is appropriate. The intended behavior is not to implement a streaming iterator but just one that returns references to the original data structure. GATs are not relevant. This problem is in essence no different from `slice::IterMut` (except that it is possible to work around it in safe code, as the answer shows in two different ways). – trent May 24 '20 at 01:57
  • @trentcl but the problem of slice IterMut is GATs no ? – Stargateur May 24 '20 at 02:28
  • @Stargateur No, `IterMut` is just hard to implement in safe code, but it's easy to express in the type system. – trent May 24 '20 at 12:39

1 Answers1

9

Implementing an iterator over mutable references is hard in general. It becomes unsound if the iterator ever returns references to the same element twice. That means that if you want to write one in purely safe code, you have to somehow convince the compiler that each element is only visited once. That rules out simply using an index: you could always forget to increment the index or set it somewhere and the compiler wouldn't be able to reason about it.


One possible way around is chaining together several std::iter::onces (one for each reference you want to iterate over).

For example,

impl StatusRegister {
    fn iter_mut(&mut self) -> impl Iterator<Item = &mut bool> {
        use std::iter::once;
        once(&mut self.CarryFlag)
            .chain(once(&mut self.ZeroFlag))
            .chain(once(&mut self.OverflowFlag))
    }
}

(playground)

Upsides:

  • Fairly simple to implement.
  • No allocations.
  • No external dependencies.

Downsides:

  • The iterator has a very complicated type: std::iter::Chain<std::iter::Chain<std::iter::Once<&mut bool>, std::iter::Once<&mut bool>>, std::iter::Once<&mut bool>>.

So you if don't want to use impl Iterator<Item = &mut bool>, you'll have to have that in your code. That includes implementing IntoIterator for &mut StatusRegister, since you'd have to explicitly indicate what the IntoIter type is.


Another approach is using an array or Vec to hold all the mutable references (with the correct lifetime) and then delegate to its iterator implementation to get the values. For example,

impl StatusRegister {
    fn iter_mut(&mut self) -> std::vec::IntoIter<&mut bool> {
        vec![
            &mut self.CarryFlag,
            &mut self.ZeroFlag,
            &mut self.OverflowFlag,
        ]
        .into_iter()
    }
}

(playground)

Upsides:

  • The type is the much more manageable std::vec::IntoIter<&mut bool>.
  • Still fairly simple to implement.
  • No external dependencies.

Downsides:

  • Requires an allocation every time iter_mut is called.

I also mentioned using an array. That would avoid the allocation, but it turns out that arrays don't yet implement an iterator over their values, so the above code with a [&mut bool; 3] instead of a Vec<&mut bool> won't work. However, there exist crates that implement this functionality for fixed-length arrays with limited size, e.g. arrayvec (or array_vec).

Upsides:

  • No allocation.
  • Simple iterator type.
  • Simple to implement.

Downsides:

  • External dependency.

The last approach I'll talk about is using unsafe. Since this doesn't have many good upsides over the other approaches, I wouldn't recommend it in general. This is mainly to show you how you could implement this.

Like your original code, we'll implement Iterator on our own struct.

impl<'a> IntoIterator for &'a mut StatusRegister {
    type IntoIter = StatusRegisterIterMut<'a>;
    type Item = &'a mut bool;

    fn into_iter(self) -> Self::IntoIter {
        StatusRegisterIterMut {
            status: self,
            index: 0,
        }
    }
}

pub struct StatusRegisterIterMut<'a> {
    status: &'a mut StatusRegister,
    index: usize,
}

The unsafety comes from the next method, where we'll have to (essentially) convert something of type &mut &mut T to &mut T, which is generally unsafe. However, as long as we ensure that next isn't allowed to alias these mutable references, we should be fine. There may be some other subtle issues, so I won't guarantee that this is sound. For what it's worth, MIRI doesn't find any problems with this.

impl<'a> Iterator for StatusRegisterIterMut<'a> {
    type Item = &'a mut bool;

    // Invariant to keep: index is 0, 1, 2 or 3
    // Every call, this increments by one, capped at 3
    // index should never be 0 on two different calls
    // and similarly for 1 and 2.
    fn next(&mut self) -> Option<Self::Item> {
        let result = unsafe {
            match self.index {
                // Safety: Since each of these three branches are
                // executed exactly once, we hand out no more than one mutable reference
                // to each part of self.status
                // Since self.status is valid for 'a
                // Each partial borrow is also valid for 'a
                0 => &mut *(&mut self.status.CarryFlag as *mut _),
                1 => &mut *(&mut self.status.ZeroFlag as *mut _),
                2 => &mut *(&mut self.status.OverflowFlag as *mut _),
                _ => return None
            }
        };
        // If self.index isn't 0, 1 or 2, we'll have already returned
        // So this bumps us up to 1, 2 or 3.
        self.index += 1;
        Some(result)
    }
}

(playground)

Upsides:

  • No allocations.
  • Simple iterator type name.
  • No external dependencies.

Downsides:

  • Complicated to implement. To successfully use unsafe, you need to be very familiar with what is and isn't allowed. This part of the answer took me the longest by far to make sure I wasn't doing something wrong.
  • Unsafety infects the module. Within the module defining this iterator, I could "safely" cause unsoundness by messing with the status or index fields of StatusRegisterIterMut. The only thing allowing encapsulation is that outside of this module, those fields aren't visible.
SCappella
  • 9,534
  • 1
  • 26
  • 35
  • I still think GATs would solve this issue better but nice answer that sum up all current trick. I think vector or array one are probably the best in simplicity and efficiently because the chain one could be a little costly because it's O(n) where slice len is O(1). Note that the unsafe version is also O(n) but the compiler could optimise it quite easily. – Stargateur May 24 '20 at 01:53
  • @Stargateur The streaming iterator version in this case is strictly worse than a normal iterator because it still mutably borrows the original struct but cannot be `collect`ed. GATs solve problems where the type system is not currently expressive enough, but in this case the type system is already plenty expressive and the difficulty is in implementation. – trent May 24 '20 at 02:10
  • @trentcl I don't agree with you understanding of the problem. GATs will allow the iterator to be collected I don't see why it's should not in fact. GATs will allow the possibility to borrow the iterator and so ensure the mutable borrow only exist once. So "in this case the type system is already plenty expressive" I don't agree or I miss something. – Stargateur May 24 '20 at 02:30
  • @Stargateur You can't collect *any* streaming iterator. The API doesn't allow it (or other common uses like `let first = iter.next(); let second = iter.next(); foo(first, second);`). Streaming iterators are a generalization of non-streaming iterators (because they can be written where non-streaming iterators cannot) but they are less capable (because they can't do all the things non-streaming iterators can do). – trent May 24 '20 at 13:10