
I am using the git2 crate and would like to get the Statuses for a repo and store them in my app struct for later reuse, since they are expensive to create. The problem is that Statuses borrows from the Repository it was created from. As far as I understand from the question "Why can't I store a value and a reference to that value in the same struct?", I can't return an owned item along with a reference to it, because the owned item's address changes when it is moved out of the function, which would invalidate the reference. Below is a minimal example of what I am trying to do. What is the correct way to tackle this?

use git2::{Repository, Statuses};

struct App<'a> {
    repo: Repository,
    statuses: Statuses<'a>,
}
impl<'a> App<'a> {
    fn new() -> Self {
        let repo = Repository::open("myrepo").unwrap();
        let statuses = repo.statuses(None).unwrap();
        App { repo, statuses }
    }
}

fn main() {
    let mydata = App::new();
    dbg!(mydata.statuses.len());
}

Below is the only solution I have found (also taken from the above question), which is to make Statuses optional and fill it in after the App (with its Repository) has already been returned from App::new(). This seems hacky and unidiomatic, and it doesn't compile anyway.

use git2::{Repository, Statuses};

struct App<'a> {
    repo: Repository,
    statuses: Option<Statuses<'a>>,
}
impl<'a> App<'a> {
    fn new() -> Self {
        let repo = Repository::open("myrepo").unwrap();
        App {
            repo,
            statuses: None,
        }
    }
}

fn main() {
    let mut mydata = App::new();
    mydata.statuses = mydata.repo.statuses(None).ok();
    dbg!(mydata.statuses.unwrap().len());
}

error[E0597]: `mydata.repo` does not live long enough
  --> src/main.rs:19:23
   |
19 |     mydata.statuses = mydata.repo.statuses(None).ok();
   |                       ^^^^^^^^^^^^^^^^^^^^^^^^^^ borrowed value does not live long enough
20 |     dbg!(mydata.statuses.unwrap().len());
21 | }
   | -
   | |
   | `mydata.repo` dropped here while still borrowed
   | borrow might be used here, when `mydata` is dropped and runs the destructor for type `App<'_>`

EDIT: Some additional context: I am making an app with egui, so the App struct is the application state. Among other things, it will list all git repos in a directory and display information about their statuses. I measured the repo.statuses(None).unwrap() call, and for ~10 repositories it took a total of 4ms, which is too slow to call on every iteration of the app's update loop. The obvious solution I could think of was to store the data in the app's state (App), but that doesn't seem possible, so I'm looking for alternative approaches.

Max888
  • Maybe you need to `to_owned()` on that? – tadman Oct 01 '22 at 23:36
    @PitaJ The OP mentions this question. – Chayim Friedman Oct 02 '22 at 01:11
  • The question isn't about just "returning" a self-referential struct - it's about storing a value and a reference to that value in the same struct. The same restriction applies regardless of context. The struct could be moved, and in doing so invalidate the reference, so the reference can't be stored with the value. – PitaJ Oct 02 '22 at 01:19
    You should try to apply some of the crates in the linked Q&A, particularly ouroboros and self-cell, if you really want to pursue this design. – kmdreko Oct 02 '22 at 01:32
  • What's the context of your problem? There is a good chance that the proper fix is to restructure how your program uses the `git2` crate. Also, *"it is expensive to create"* - did you measure it? Are you sure? As it doesn't own its data, but instead references the `Repository` object, I would have imagined it being as lazy as possible. – Finomnis Oct 02 '22 at 07:43
  • @Finomnis I am making an app with egui, so the `App` struct is the application state. Among other things, it will list all git repos in a directory and display information about their statuses. I agree it is surprising that creating `Statuses` is slow, but I did measure it, and for ~10 repositories it took a total of 4ms, so it is too slow to create on each loop of the app. To me the obvious thing to do was to try to store the data in the app's state (`App`). – Max888 Oct 02 '22 at 08:54
  • @Max888 Sounds to me like you actually want to extract the information you need from the `Statuses` object, and then release the `Statuses` object again. You don't want the `Statuses` object to stick around, I think. – Finomnis Oct 02 '22 at 09:02
  • @Finomnis yes, that's what I was just thinking. – Max888 Oct 02 '22 at 09:05

1 Answer


I think there are two solutions:

  • Copy the data you want out of the Statuses object and then release it
  • Use an external crate like self_cell to create a self-referential object. Note that this object can then no longer provide mut access to the Repository.

I'd argue that the first option would be the way to go in your case, because to my understanding, Statuses is simply a collection of paths with a status for each path.

use std::collections::HashMap;

use git2::{Repository, Status};

struct App {
    repo: Repository,
    statuses: HashMap<String, Status>,
}
impl App {
    fn new() -> Self {
        let repo = Repository::open(".").unwrap();
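        let statuses = repo
            .statuses(None) // temporary `Statuses`; it borrows `repo` only within this statement
            .unwrap()
            .iter() // note: `path()` is `None` for non-UTF-8 paths, so the `unwrap` below can panic
            .map(|el| (el.path().unwrap().to_string(), el.status()))
            .collect::<HashMap<_, _>>(); // owned `String` keys and `Status` values; no borrow remains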
        App { repo, statuses }
    }
}

fn main() {
    let mydata = App::new();
    dbg!(mydata.statuses.len());
    println!("{:#?}", mydata.statuses);
}

[src/main.rs:24] mydata.statuses.len() = 2
{
    "Cargo.toml": WT_MODIFIED,
    "src/main.rs": INDEX_MODIFIED | WT_MODIFIED,
}
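
Since the statuses will change while an egui app is running, the owned snapshot probably needs to be rebuilt now and then rather than only in new(). Below is a minimal sketch of that idea using only the standard library, so it compiles without git2. FakeRepo and FakeStatuses are hypothetical stand-ins for git2's Repository and Statuses, the u32 flags stand in for Status, and snapshot and refresh_if_stale are names I made up for illustration. None of this is git2 API, and the refresh interval is an assumption you would tune yourself.

```rust
use std::collections::HashMap;
use std::time::{Duration, Instant};

/// Hypothetical stand-in for `git2::Repository`: owns the underlying data.
struct FakeRepo {
    entries: Vec<(String, u32)>,
}

/// Hypothetical stand-in for `git2::Statuses<'_>`: a view that borrows the repo.
struct FakeStatuses<'repo> {
    entries: &'repo [(String, u32)],
}

impl FakeRepo {
    fn statuses(&self) -> FakeStatuses<'_> {
        FakeStatuses { entries: &self.entries }
    }
}

/// Copy the borrowed view into an owned map. Nothing in the returned map
/// points back into `repo`, so the borrow ends when this function returns.
fn snapshot(repo: &FakeRepo) -> HashMap<String, u32> {
    repo.statuses()
        .entries
        .iter()
        .map(|(path, flags)| (path.clone(), *flags))
        .collect()
}

/// Application state holding only owned data, so no lifetime parameter is needed.
struct App {
    repo: FakeRepo,
    statuses: HashMap<String, u32>,
    refreshed_at: Instant,
}

impl App {
    fn new(repo: FakeRepo) -> Self {
        let statuses = snapshot(&repo);
        App { repo, statuses, refreshed_at: Instant::now() }
    }

    /// Rebuild the snapshot only if it is at least `max_age` old.
    /// Returns `true` if a refresh actually happened.
    fn refresh_if_stale(&mut self, max_age: Duration) -> bool {
        if self.refreshed_at.elapsed() < max_age {
            return false;
        }
        self.statuses = snapshot(&self.repo);
        self.refreshed_at = Instant::now();
        true
    }
}

fn main() {
    let repo = FakeRepo {
        entries: vec![("Cargo.toml".to_string(), 1), ("src/main.rs".to_string(), 3)],
    };
    let mut app = App::new(repo);
    assert_eq!(app.statuses.len(), 2);

    // The snapshot was just built, so a 60 second budget skips the expensive call.
    assert!(!app.refresh_if_stale(Duration::from_secs(60)));

    // Simulate a change in the repo, then force a refresh with a zero budget.
    app.repo.entries.push(("README.md".to_string(), 1));
    assert!(app.refresh_if_stale(Duration::ZERO));
    assert_eq!(app.statuses.len(), 3);
}
```

In the real app, snapshot would contain the repo.statuses(None) chain from above, and refresh_if_stale could be called at the top of each frame so the expensive call runs at most once per interval.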
Finomnis
  • Ah I'm such a dingus, I hadn't realised `Status` was owned. `StatusEntry` is a reference so I somehow ended up assuming `Status` was as well. Thanks! – Max888 Oct 02 '22 at 09:51