
I'm building a stream-based file parser. Each line of the file has the format $TYPE$CONTENT\n, where $TYPE is 4 bytes and $CONTENT runs up to the \n.

$TYPE is either an attribute of the current record, or indicates that a new record starts. For example:

strtPerson
nameMustermann
surnMax
strtPerson
nameMustermann
surnRenate
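
As a minimal sketch of how one line splits under this format (the helper name `split_line` is hypothetical, and it assumes the type tag is plain ASCII so that byte offset 4 is a valid string boundary):

```rust
/// Splits one line into its 4-byte type tag and the remaining content.
/// Returns `None` if the line is shorter than 4 bytes or cannot be
/// sliced at byte offset 4.
fn split_line(line: &str) -> Option<(&str, &str)> {
    // `str::get` returns `None` instead of panicking when the range is
    // out of bounds or ends inside a multi-byte UTF-8 character.
    let tag = line.get(..4)?;
    let content = line.get(4..)?;
    Some((tag, content))
}

fn main() {
    for line in ["strtPerson", "nameMustermann", "x"] {
        match split_line(line) {
            Some((tag, content)) => println!("{tag}: {content}"),
            None => println!("malformed line: {line:?}"),
        }
    }
}
```

Using `get` rather than direct indexing means a short or malformed line is reported instead of causing a panic.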

My approach was to open the file, get an iterator over the lines, split the lines and then feed the lines to a parser:

use std::io::{self, BufRead, BufReader};

pub trait Record {
    fn add_line(&mut self, type_: String, content: String) -> &mut dyn Record;
}

pub struct File {
    pub current_record: Option<Box<dyn Record>>,
    pub records: Vec<Box<dyn Record>>,
}

impl File {
    pub fn push_last_record(&mut self) {
        if let Some(record) = self.current_record.take() {
            self.records.push(record);
        }
    }
}

impl Record for File {
    fn add_line(&mut self, type_: String, content: String) -> &mut dyn Record {
        match &type_[..] {
            "strt" => {
                self.push_last_record();
                self.current_record = Some(Box::new(File {
                    current_record: None,
                    records: Vec::new(),
                }));
            }
            _ => {
                if let Some(current) = &mut self.current_record {
                    current.add_line(type_, content);
                }
            }
        }
        self
    }
}

fn main() -> io::Result<()> {
    let f =
        "strtPerson\nnameMustermann\nsurnMax\nstrtPerson\nnameMustermann\nsurnRenate".as_bytes();
    let reader = BufReader::new(f);
    let mut f = File {
        current_record: None,
        records: Vec::new(),
    };
    for line in reader.lines() {
        let line = line.unwrap();
        // The type tag is the first 4 bytes; the content starts right after it.
        f.add_line(line[..4].to_owned(), line[4..].to_owned());
    }
    println!("there are {} records read", f.records.len());
    f.push_last_record();
    println!("there are now {} records read", f.records.len());
    Ok(())
}

This solution works, but requiring a call to push_last_record is cumbersome and error-prone, and I think there could be a more idiomatic solution.

My main issue is how to create the new Record when the line starts with "strt" and somehow save a mutable reference to it, so I can call add_line directly on the latest record.

It boils down to creating a Box, putting it in the vector and saving a mutable reference to it in the struct.

Is there a ready-made pattern for this or could anyone give me a hint on how to make this clean?

EDIT:

Thanks to the hint with last_mut(), I rewrote my code. I'm not sure if it's idiomatic or efficient, but I think it's way cleaner than before.

use std::io::{self, BufRead, BufReader};

pub trait Record {
    fn add_line(&mut self, type_: String, content: String) -> &mut dyn Record;
}

pub struct File<'a> {
    pub current_record: Option<&'a mut Box<dyn Record>>,
    pub records: Vec<Box<dyn Record>>,
}

impl Record for File<'_> {
    fn add_line(&mut self, type_: String, content: String) -> &mut dyn Record {
        match &type_[..] {
            "strt" => {
                self.records.push(Box::new(File {
                    current_record: None,
                    records: Vec::new(),
                }));
            }
            _ => {
                if let Some(current) = self.records.last_mut() {
                    current.add_line(type_, content);
                }
            }
        }
        self
    }
}

fn main() -> io::Result<()> {
    let f =
        "strtPerson\nnameMustermann\nsurnMax\nstrtPerson\nnameMustermann\nsurnRenate".as_bytes();
    let reader = BufReader::new(f);
    let mut f = File {
        current_record: None,
        records: Vec::new(),
    };
    for line in reader.lines() {
        let line = line.unwrap();
        // The type tag is the first 4 bytes; the content starts right after it.
        f.add_line(line[..4].to_owned(), line[4..].to_owned());
    }
    println!("there are {} records read", f.records.len());
    Ok(())
}
Shepmaster
stffn
  • 6
    This code has a lot of typos. `match &code[..]` despite `code` being an `i32`, `&mut den Record` instead of `&mut dyn Record`, missing close braces, a custom `File` struct used in conjunction with the standard library `File` struct, and more. It'd be very helpful if you could put together a [mre]. – Aplet123 Jul 20 '21 at 18:07
  • 2
    It looks like your question might be answered by the answers of [Is there an idiomatic way to keep references to elements of an ever-growing container?](https://stackoverflow.com/q/41034046/155423); [How can a function append a value to a vector and also return that value?](https://stackoverflow.com/q/50980100/155423). If not, please **[edit]** your question to explain the differences. Otherwise, we can mark this question as already answered. – Shepmaster Jul 20 '21 at 18:12
  • 1
    See also [Why can't I store a value and a reference to that value in the same struct?](https://stackoverflow.com/q/32300132/155423) – Shepmaster Jul 20 '21 at 18:15
  • 2
    @Shepmaster thanks for the other questions, where I found the last_mut() method of Vec. I think that's doing the job. – stffn Jul 20 '21 at 19:07

1 Answer


For current_record, you should store an index into the records Vec instead of a temporary reference. Self-referential structs are not safe in Rust, and even if they worked, temporary references in structs are incredibly limiting and impractical to use.
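
A minimal sketch of the index approach, assuming a hypothetical concrete `Person` record and a `Parser` struct in place of the question's trait objects (both names are illustrative, not taken from the question):

```rust
/// Hypothetical concrete record type used only for this sketch.
#[derive(Debug, Default)]
struct Person {
    fields: Vec<(String, String)>,
}

#[derive(Default)]
struct Parser {
    records: Vec<Person>,
    /// Index of the record currently being filled, instead of a reference.
    current: Option<usize>,
}

impl Parser {
    fn add_line(&mut self, tag: &str, content: &str) {
        if tag == "strt" {
            // Start a new record and remember where it lives in the Vec.
            self.records.push(Person::default());
            self.current = Some(self.records.len() - 1);
        } else if let Some(i) = self.current {
            // The index is turned into a short-lived borrow on each use.
            self.records[i]
                .fields
                .push((tag.to_owned(), content.to_owned()));
        }
        // Attribute lines seen before the first "strt" are ignored here.
    }
}

/// Feeds every well-formed line of `input` to a fresh parser.
fn parse(input: &str) -> Parser {
    let mut parser = Parser::default();
    for line in input.lines() {
        if let (Some(tag), Some(content)) = (line.get(..4), line.get(4..)) {
            parser.add_line(tag, content);
        }
    }
    parser
}

fn main() {
    let parser =
        parse("strtPerson\nnameMustermann\nsurnMax\nstrtPerson\nnameMustermann\nsurnRenate");
    println!("there are {} records read", parser.records.len());
    println!("{:?}", parser.records);
}
```

Because every record is pushed into `records` as soon as it starts, no final "flush" call is needed, and the struct holds no borrow of its own contents.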

Alternatively, you could use Arc instead of Box so that one value can be owned from multiple places. Note that mutating a value behind an Arc additionally requires interior mutability, such as a Mutex.
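
A rough sketch of the shared-ownership variant. Sharing alone does not permit mutation, so this assumes each record is wrapped in a `Mutex`; the `Person`, `Parser` and `field_count` names are hypothetical and only serve the illustration:

```rust
use std::sync::{Arc, Mutex};

/// Hypothetical concrete record type used only for this sketch.
#[derive(Debug, Default)]
struct Person {
    fields: Vec<(String, String)>,
}

#[derive(Default)]
struct Parser {
    records: Vec<Arc<Mutex<Person>>>,
    /// A second owning handle to the record currently being filled.
    current: Option<Arc<Mutex<Person>>>,
}

impl Parser {
    fn add_line(&mut self, tag: &str, content: &str) {
        if tag == "strt" {
            let record = Arc::new(Mutex::new(Person::default()));
            // Both handles point at the same allocation.
            self.records.push(Arc::clone(&record));
            self.current = Some(record);
        } else if let Some(current) = &self.current {
            current
                .lock()
                .expect("mutex poisoned")
                .fields
                .push((tag.to_owned(), content.to_owned()));
        }
    }
}

/// Number of attribute lines stored in the record at `index`.
fn field_count(parser: &Parser, index: usize) -> usize {
    let guard = parser.records[index].lock().expect("mutex poisoned");
    guard.fields.len()
}

fn main() {
    let mut parser = Parser::default();
    for (tag, content) in [("strt", "Person"), ("name", "Mustermann"), ("surn", "Max")] {
        parser.add_line(tag, content);
    }
    println!("first record has {} fields", field_count(&parser, 0));
}
```

In single-threaded code, `Rc<RefCell<_>>` plays the same role with less overhead. Either way this costs a lock or borrow check on every access, which is why the index approach is usually the simpler choice for this parser.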

Kornel
  • Wouldn't storing an index be essentially the same as using last_mut()? The main difference would be that I store one additional piece of information. Whichever solution I go for, it will break if I share it... – stffn Aug 12 '21 at 20:38
  • If you can access the right element in some other way than using a temporary reference, that's all fine. If you had a more complex case where indices are not good enough, there's also [generational arena](https://lib.rs/search?q=generational+arena), `Arc`, and [crates that help make self-referential structs](https://lib.rs/crates/ouroboros). – Kornel Aug 29 '21 at 21:15