I have the mindset of keeping my Strings immutable, as a single source of truth. When I take the same mindset into Rust, I find I have to do a lot of cloning. Since the Strings never change, all the cloning seems unnecessary. Below is an example of this, with a link to the relevant playground.

Borrowing does not seem like an option, as I would have to deal with references and their lifetimes. My next thought is to use something like Rc or Cow. But wrapping all the Strings in something like Rc feels unnatural. In my limited experience of Rust, I have never seen any exposed ownership/memory management structs, such as Rc and Cow. I am curious how a more experienced Rust developer would handle such a problem.

Is it actually natural in Rust to expose ownership/memory management structs like Rc and Cow? Should I be using slices?

use std::collections::HashSet;

#[derive(Debug)]
enum Check {
    Known(String),
    Duplicate(String),
    Missing(String),
    Unknown(String)
}

fn main() {
    let known_values: HashSet<_> = [
        "a".to_string(),
        "b".to_string(),
        "c".to_string()]
            .iter().cloned().collect();

    let provided_values = vec![
        "a".to_string(),
        "b".to_string(),
        "z".to_string(),
        "b".to_string()
    ];

    let mut found = HashSet::new();

    let mut check_values: Vec<_> = provided_values.iter().cloned()
        .map(|v| {
            if known_values.contains(&v) {
                if found.contains(&v) {
                    Check::Duplicate(v)
                } else {
                    found.insert(v.clone());
                    Check::Known(v)
                }
            } else {
                Check::Unknown(v)
            }
        }).collect();

    let missing = known_values.difference(&found);

    check_values = missing
        .cloned()
        .fold(check_values, |mut cv, m| {
            cv.push(Check::Missing(m));
            cv
        });

    println!("check_values: {:#?}", check_values);
}
NebulaFox
  • *In my limited experience of Rust, I have never seen any exposed memory management structs* -- `String` and `Vec` manage their own memory, just like `Rc` does. – trent Mar 15 '20 at 13:38
  • In languages like Java strings are accessed by reference and the same objects are shared by many pieces of code. Rust doesn't work that way. Ownership is explicit. You'd have to deliberately pass a `&mut` reference or similar to have some other code mutate a string you own. I think you're carrying over habits from another language that don't apply to Rust. – John Kugelman Mar 15 '20 at 13:39
  • @trentcl This isn't the point of my question. Yes, I know `String` and `Vec` manage their own memory, but they are still subject to the ownership rules. I can't pass a `String` or `Vec` around without moving or borrowing. This is why I focused on `Rc` and `Cow` in my question. – NebulaFox Mar 15 '20 at 14:27
  • @JohnKugelman I work in a lot of languages, so it is very likely that I am carrying over habits from another language. – NebulaFox Mar 15 '20 at 14:29
  • The impression that I'm getting is that I should encapsulate ownership/memory management. – NebulaFox Mar 15 '20 at 14:51
  • It seemed to me, from the phrasing of the part I quoted, that you weren't aware that `String` and `Vec` are also "memory management structs". Yes, if you want a pointer with shared ownership semantics, you probably want something like `Rc` or `Arc`. – trent Mar 15 '20 at 15:15
  • For your example the lifetimes of all the strings are long enough that you can just use `&str` everywhere. https://play.rust-lang.org/?version=stable&mode=debug&edition=2018&gist=795d60b705dab4561a20030be545ccca. I tend to use `&str` if I can, and `String` only if I really need independent ownership of a value, or to construct a new value. – Michael Anderson Mar 16 '20 at 03:26
  • @MichaelAnderson You are right, and thanks for the advice. But this being example code, it is best not to read too much into it. It is a simplified version of the real thing, and I kept it as close as possible by using similar types. – NebulaFox Mar 16 '20 at 11:03
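
The `&str` suggestion from the comments can be sketched roughly as follows. This is not the code behind the playground link above, only a minimal illustration under some assumptions: the function name `check`, the lifetime parameter on `Check`, and the sorting of the missing values (added only to make the output order deterministic) are all hypothetical choices. It works only when the owned Strings outlive the results, which is the case in the example.

```rust
use std::collections::HashSet;

// Variant payloads borrow from the caller's Strings instead of owning copies.
#[derive(Debug, PartialEq)]
enum Check<'a> {
    Known(&'a str),
    Duplicate(&'a str),
    Missing(&'a str),
    Unknown(&'a str),
}

// Hypothetical helper: both inputs must outlive the returned Vec.
fn check<'a>(known: &'a HashSet<String>, provided: &'a [String]) -> Vec<Check<'a>> {
    let mut found: HashSet<&str> = HashSet::new();

    let mut out: Vec<Check<'a>> = provided
        .iter()
        .map(|v| {
            let v = v.as_str();
            if !known.contains(v) {
                Check::Unknown(v)
            } else if found.insert(v) {
                // `insert` returns true the first time a value is seen.
                Check::Known(v)
            } else {
                Check::Duplicate(v)
            }
        })
        .collect();

    // Known values that were never provided; sorted for a stable order.
    let mut missing: Vec<&str> = known
        .iter()
        .map(String::as_str)
        .filter(|k| !found.contains(k))
        .collect();
    missing.sort_unstable();
    out.extend(missing.into_iter().map(Check::Missing));
    out
}
```

No String is cloned here. The cost is the `'a` lifetime on `Check`, which ties every result to the collections it was computed from.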

1 Answer

From the discussion in the comments on my question, all the cloning of immutable Strings in the example is correct. The cloning is necessary because Rust manages memory through ownership, rather than through shared references as other languages do.

At best, without using Rc, I can see some reduction in the cloning by using move semantics on provided_values.
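
As a rough sketch of what that could look like, assuming the caller is happy to give up both collections: `into_iter()` moves each String out of `provided_values`, so the only clone left is the one stored in `found`. The function name `check` and the sorting of the missing values (for a deterministic order) are hypothetical additions, not part of the original example.

```rust
use std::collections::HashSet;

#[derive(Debug, PartialEq)]
enum Check {
    Known(String),
    Duplicate(String),
    Missing(String),
    Unknown(String),
}

// Hypothetical helper: takes ownership of both inputs so values can be moved.
fn check(known: HashSet<String>, provided: Vec<String>) -> Vec<Check> {
    let mut found: HashSet<String> = HashSet::new();

    let mut out: Vec<Check> = provided
        .into_iter() // moves each String instead of cloning it
        .map(|v| {
            if !known.contains(&v) {
                Check::Unknown(v)
            } else if found.contains(&v) {
                Check::Duplicate(v)
            } else {
                // The one remaining clone: `found` and the result both need the value.
                found.insert(v.clone());
                Check::Known(v)
            }
        })
        .collect();

    // Move the never-provided known values into the result; sorted for a stable order.
    let mut missing: Vec<String> = known
        .into_iter()
        .filter(|k| !found.contains(k))
        .collect();
    missing.sort();
    out.extend(missing.into_iter().map(Check::Missing));
    out
}
```

This leaves one clone per distinct known value that was provided, rather than one per provided value plus one per missing value.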

Update: Some interesting reading

Cow would not work in my example, as it involves borrowing a reference. Rc is what I would have to use. In my example everything has to be converted to Rc, but I can see the potential for all of this to be hidden away through encapsulation.


use std::collections::HashSet;
use std::rc::Rc;

#[derive(Debug)]
enum Check {
    Known(Rc<String>),
    Duplicate(Rc<String>),
    Missing(Rc<String>),
    Unknown(Rc<String>)
}

fn main() {
    let known_values: HashSet<_> = [
        Rc::new("a".to_string()),
        Rc::new("b".to_string()),
        Rc::new("c".to_string())]
            .iter().cloned().collect();

    let provided_values = vec![
        Rc::new("a".to_string()),
        Rc::new("b".to_string()),
        Rc::new("z".to_string()),
        Rc::new("b".to_string())
    ];

    let mut found = HashSet::new();

    let mut check_values: Vec<_> = provided_values.iter().cloned()
        .map(|v| {
            if known_values.contains(&v) {
                if found.contains(&v) {
                    Check::Duplicate(v)
                } else {
                    found.insert(v.clone());
                    Check::Known(v)
                }
            } else {
                Check::Unknown(v)
            }
        }).collect();

    let missing = known_values.difference(&found);

    check_values = missing
        .cloned()
        .fold(check_values, |mut cv, m| {
            cv.push(Check::Missing(m));
            cv
        });

    println!("check_values: {:#?}", check_values);
}

NebulaFox
  • You might also want to try `Rc<str>` instead of `Rc<String>` - since your strings are immutable anyway, this might work too, with one less layer of indirection. – Cerberus Mar 16 '20 at 04:45
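
A minimal sketch of the `Rc<str>` variant, for comparison with the `Rc<String>` code in the answer. The function name `check` and the sorting of the missing values (for a deterministic order) are hypothetical additions. `Rc<str>` stores the string data directly in the reference-counted allocation, so reading it follows one pointer instead of two.

```rust
use std::collections::HashSet;
use std::rc::Rc;

#[derive(Debug, PartialEq)]
enum Check {
    Known(Rc<str>),
    Duplicate(Rc<str>),
    Missing(Rc<str>),
    Unknown(Rc<str>),
}

// Hypothetical helper: every "clone" below only bumps a reference count.
fn check(known: &HashSet<Rc<str>>, provided: &[Rc<str>]) -> Vec<Check> {
    let mut found: HashSet<Rc<str>> = HashSet::new();

    let mut out: Vec<Check> = provided
        .iter()
        .map(|v| {
            let v = Rc::clone(v); // shares the allocation, copies no string data
            if !known.contains(&v) {
                Check::Unknown(v)
            } else if found.insert(Rc::clone(&v)) {
                // `insert` returns true the first time a value is seen.
                Check::Known(v)
            } else {
                Check::Duplicate(v)
            }
        })
        .collect();

    // Known values that were never provided; sorted for a stable order.
    let mut missing: Vec<Rc<str>> = known.difference(&found).cloned().collect();
    missing.sort();
    out.extend(missing.into_iter().map(Check::Missing));
    out
}
```

An `Rc<str>` can be built with `Rc::from("a")` or from an existing `String` with `Rc::from(s)`. Hashing and equality go through the string contents, so the `HashSet` lookups behave as they do with `String`.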