I am writing a program which will receive user input from CSV or JSON (that doesn't really matter). There are potentially many inputs (each line of a CSV for example), which would reference different structs. So, I need to return an instance of a struct for each input string, but I don't know upfront which struct that would be. My attempt (code doesn't compile):

fn main () {
    let zoo: Vec<Box<dyn Animal>>;
    let user_input = "Cat,Persik";
    let user_input = user_input.split(",");
    match user_input.nth(0) {
        "Cat" => zoo.push(Cat(user_input.nth(0))),
        _ => zoo.push(Dog(user_input.nth(0))) //here user would be expected to provide a u8
    } 

}

trait Animal {}

struct Dog {
    age: u8,
}
impl Animal for Dog {}

struct Cat {
    name: String,
}
impl Animal for Cat {}

One way to do it is with if statements like this, but with hundreds of animals that would make the code pretty ugly. I have a macro which returns the struct name for an instance, but I couldn't figure out a way to use it here. I also thought about using an enum for this, but couldn't figure that out either.

  1. Is there a shorter and more concise way of doing this?

  2. Doesn't this way limit me in using only methods defined in the Animal trait on items of zoo? If so, is there a way around this constraint?

Essentially, I want to get a vector of structs, and to be able to use their methods freely. I don't know how many there will be, and I don't know in advance which structs exactly.

John Kugelman
Anatoly Bugakov
  • If you need to handle each individual animal as a Dog, a Cat, etc, then the best way would be to create an enum with a variant per animal. `enum Animal { Dog(Dog), Cat(Cat) }` You could use dynamic dispatch with [downcasting](https://stackoverflow.com/questions/33687447/how-to-get-a-reference-to-a-concrete-type-from-a-trait-object) but an enum is quite a bit easier. – PitaJ Feb 18 '22 at 20:49
  • As for your question. You could use a macro to reduce repetition but given you'll have to write custom code to parse ages and stuff, I'd recommend just writing it out verbosely. – PitaJ Feb 18 '22 at 20:51
  • Thanks @PitaJ, I tried enum Animal { Dog(Dog), Cat(Cat) } but that didn't work. You can't match string with an enum variant. Unless you mean match with if statement (like in my example) and store enum in the vector? – Anatoly Bugakov Feb 18 '22 at 21:24
  • Yes, I meant match like in your example and store enum in the vector. – PitaJ Feb 19 '22 at 01:05
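
For illustration, the enum approach from the comments above might look roughly like the sketch below. Only the `enum Animal { Dog(Dog), Cat(Cat) }` shape comes from the comment; the `parse_animal` helper and the use of `Option` to signal failure are assumptions made for this example.

```rust
#[derive(Debug, PartialEq)]
struct Dog {
    age: u8,
}

#[derive(Debug, PartialEq)]
struct Cat {
    name: String,
}

// One variant per concrete animal, as suggested in the comment.
#[derive(Debug, PartialEq)]
enum Animal {
    Dog(Dog),
    Cat(Cat),
}

// Hypothetical helper: maps a (kind, data) pair to a variant.
// None signals an unknown kind or unparsable data.
fn parse_animal(kind: &str, data: &str) -> Option<Animal> {
    match kind {
        "Cat" => Some(Animal::Cat(Cat { name: data.to_string() })),
        "Dog" => Some(Animal::Dog(Dog { age: data.parse::<u8>().ok()? })),
        _ => None,
    }
}

fn main() {
    let mut zoo: Vec<Animal> = Vec::new();
    for line in ["Cat,Persik", "Dog,5"] {
        if let Some((kind, data)) = line.split_once(',') {
            if let Some(animal) = parse_animal(kind, data) {
                zoo.push(animal);
            }
        }
    }

    // Matching on the variant gives access to the concrete struct.
    for animal in &zoo {
        match animal {
            Animal::Dog(dog) => println!("dog aged {}", dog.age),
            Animal::Cat(cat) => println!("cat named {}", cat.name),
        }
    }
}
```

The string-to-variant `match` is still needed, but once a value is in the vector, every field and method of the concrete struct is reachable without downcasting.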

1 Answer

It's often helpful to use helper functions for parsing things. We can implement this function on the trait itself to keep the parsing function associated with the trait.

We'll have the function return Result<Box<dyn Animal>, ()> since it's possible for parsing to fail. (We'd probably want a proper error type instead of () in real code.)

trait Animal {}

impl dyn Animal {
    fn try_parse(kind: &str, data: &str) -> Result<Box<dyn Animal>, ()> {
        match kind {
            "Cat" => Ok(Box::new(Cat { name: data.into() })),
            "Dog" => Ok(Box::new(Dog { age: data.parse().map_err(|_e| ())? })),
            _ => Err(()),
        }
    }
}
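
As a rough sketch of what a "proper error type" in place of () could look like, the version below reports why parsing failed. The `ParseAnimalError` enum and its variant names are hypothetical and not part of the original answer; the structs mirror the ones in the question.

```rust
use std::fmt;
use std::num::ParseIntError;

trait Animal {}

#[allow(dead_code)] // the field is never read in this sketch
struct Dog {
    age: u8,
}
impl Animal for Dog {}

#[allow(dead_code)] // the field is never read in this sketch
struct Cat {
    name: String,
}
impl Animal for Cat {}

// Hypothetical error type describing why parsing failed.
#[derive(Debug)]
enum ParseAnimalError {
    UnknownKind(String),
    InvalidAge(ParseIntError),
}

impl fmt::Display for ParseAnimalError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            Self::UnknownKind(kind) => write!(f, "unknown animal kind: {}", kind),
            Self::InvalidAge(err) => write!(f, "invalid dog age: {}", err),
        }
    }
}

impl std::error::Error for ParseAnimalError {}

impl dyn Animal {
    fn try_parse(kind: &str, data: &str) -> Result<Box<dyn Animal>, ParseAnimalError> {
        match kind {
            "Cat" => Ok(Box::new(Cat { name: data.into() })),
            "Dog" => {
                // Wrap the integer parsing error in our own error type.
                let age = data.parse::<u8>().map_err(ParseAnimalError::InvalidAge)?;
                Ok(Box::new(Dog { age }))
            }
            other => Err(ParseAnimalError::UnknownKind(other.to_string())),
        }
    }
}

fn main() {
    match <dyn Animal>::try_parse("Dog", "not a number") {
        Ok(_) => println!("parsed an animal"),
        Err(err) => println!("failed: {}", err),
    }
}
```

Implementing `std::error::Error` lets callers propagate the error with `?` into a `Box<dyn Error>` if they prefer.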

Ok, so now we have a function that can be used to parse a single animal, and has a way to signal failure. We could now parse a comma-separated string building off of this function, again signaling errors if the string doesn't contain a comma:

impl dyn Animal {
    // try_parse() as defined above

    fn try_parse_comma_separated(input: &str) -> Result<Box<dyn Animal>, ()> {
        // The iterator must be mutable because next() advances it.
        let mut split = input.split(',');
        
        let parts = (split.next(), split.next());
        
        // parts is a tuple of two Option<&str>.  We can only proceed if both
        // are Some.
        
        match parts {
            (Some(kind), Some(data)) => Self::try_parse(kind, data),
            _ => Err(()),
        }
    }
}

Now our main() is trivial:

fn main() {
    let mut zoo: Vec<Box<dyn Animal>> = vec![];
    
    let user_input = "Cat,Persik";
    
    zoo.push(<dyn Animal>::try_parse_comma_separated(user_input).unwrap());
}

Separating things out like this allows us to reuse these functions in other interesting ways. Let's say you wanted to parse a string like "Cat,Persik,Dog,5" as two values. That can now be done by using iterators and mapping over our parse function:

fn main() {
    let user_input = "Cat,Persik,Dog,5";
    
    let zoo = user_input.split(',').collect::<Vec<_>>()
        .chunks_exact(2) // Group the input into slices of 2 elements each
        .map(|s| <dyn Animal>::try_parse(s[0], s[1]).unwrap())
        .collect::<Vec<_>>();
}

To answer your question about a better way to do this when managing many implementors of Animal: you could move the implementation-specific parsing logic into a similar function on each implementation instead, and call those functions from <dyn Animal>::try_parse(). The parsing logic has to live somewhere.
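
A minimal sketch of that arrangement, assuming each struct gets its own `parse` associated function (the name `parse` and the `boxed` helper are hypothetical, chosen for this example):

```rust
trait Animal {}

#[derive(Debug, PartialEq)]
struct Dog {
    age: u8,
}
impl Animal for Dog {}

#[derive(Debug, PartialEq)]
struct Cat {
    name: String,
}
impl Animal for Cat {}

// Each type owns its parsing logic.
impl Dog {
    fn parse(data: &str) -> Result<Self, ()> {
        let age = data.parse::<u8>().map_err(|_| ())?;
        Ok(Dog { age })
    }
}

impl Cat {
    fn parse(data: &str) -> Result<Self, ()> {
        Ok(Cat { name: data.to_string() })
    }
}

// Hypothetical helper: boxes a successfully parsed value as a trait object.
fn boxed<A: Animal + 'static>(parsed: Result<A, ()>) -> Result<Box<dyn Animal>, ()> {
    parsed.map(|animal| Box::new(animal) as Box<dyn Animal>)
}

impl dyn Animal {
    // The dispatcher only maps identifiers to the per-type functions.
    fn try_parse(kind: &str, data: &str) -> Result<Box<dyn Animal>, ()> {
        match kind {
            "Cat" => boxed(Cat::parse(data)),
            "Dog" => boxed(Dog::parse(data)),
            _ => Err(()),
        }
    }
}

fn main() {
    let zoo: Vec<Box<dyn Animal>> = ["Cat,Persik", "Dog,5"]
        .iter()
        .filter_map(|line| line.split_once(','))
        .filter_map(|(kind, data)| <dyn Animal>::try_parse(kind, data).ok())
        .collect();
    println!("parsed {} animals", zoo.len());
}
```

The `match` still grows by one line per animal, but each arm stays a one-liner and the type-specific details live next to the type they belong to.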


Doesn't this way limit me in using only methods defined in the Animal trait on items of zoo? If so, is there a way around this constraint?

Without downcasting, yes. Generally when you have a collection of polymorphic values like dyn Animal, you want to use them polymorphically, invoking only methods defined on the Animal trait. Each implementation of the trait on a specific type can implement the trait's interface however it makes sense for that animal.
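
To make the polymorphic style concrete, here is a small sketch in which the trait carries a method. The `describe` method and the `describe_all` function are hypothetical; the trait in the question is empty.

```rust
trait Animal {
    // Hypothetical trait method; each animal implements it in its own way.
    fn describe(&self) -> String;
}

struct Dog {
    age: u8,
}

struct Cat {
    name: String,
}

impl Animal for Dog {
    fn describe(&self) -> String {
        format!("a dog aged {}", self.age)
    }
}

impl Animal for Cat {
    fn describe(&self) -> String {
        format!("a cat named {}", self.name)
    }
}

// Works for any mix of animals without knowing their concrete types.
fn describe_all(zoo: &[Box<dyn Animal>]) -> Vec<String> {
    zoo.iter().map(|animal| animal.describe()).collect()
}

fn main() {
    let zoo: Vec<Box<dyn Animal>> = vec![
        Box::new(Cat { name: "Persik".to_string() }),
        Box::new(Dog { age: 5 }),
    ];
    for line in describe_all(&zoo) {
        println!("{}", line);
    }
}
```

Anything that can be expressed as a trait method this way avoids the need to recover the concrete type later.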

Downcasting is non-trivial, but with a helper trait it becomes a bit more palatable:

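use std::any::Any;

// Helper trait that exposes any implementor as a &dyn Any for downcasting.
trait AsAny {
    fn as_any(&self) -> &dyn Any;
}

// Blanket implementation for every 'static type that implements Animal.
impl<T: 'static + Animal> AsAny for T {
    fn as_any(&self) -> &dyn Any { self }
}

trait Animal: AsAny { }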

Now, given an animal: Box<dyn Animal> you can use animal.as_any().downcast_ref::<Dog>() for example, which gives you back an Option<&Dog>. This will be None if the boxed animal isn't a dog. Based on the zoo in the last example (with a dog and a cat):

let dogs = zoo.iter()
    // Filter down the zoo to just dogs (produces a sequence of &Dog)
    .filter_map(|animal| animal.as_any().downcast_ref::<Dog>());

// We should only find one dog in the zoo.
assert_eq!(dogs.count(), 1);

But this should be an absolute last resort when using your animals polymorphically isn't an option.

cdhowie
  • Thank you very much Sir, you are a legend! Downcasting is a new concept to me, but it was very useful to look into that. My only concern is the match statement in the try_parse function (essentially mapping a string to the struct). If the number of potential structs (animals) grows to hundreds/thousands, this can get out of hand. But as far as I understand, there is no more robust way of doing it, right? – Anatoly Bugakov Feb 19 '22 at 12:05
  • @АнатолийБугаков There are variations of the same approach, such as having a HashMap of parsing functions. At the end of the day, you need some way to map the identifier to the parsing function. Macros might be able to alleviate some of the maintenance burden, but there is a limit to their power as well. – cdhowie Feb 19 '22 at 20:04
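
One possible shape for the HashMap variation mentioned in the last comment is sketched below. The `Parser` alias, the `registry` function, and the free-standing `try_parse` are all hypothetical names introduced for this example.

```rust
use std::collections::HashMap;

trait Animal {}

#[allow(dead_code)] // the field is never read in this sketch
struct Dog {
    age: u8,
}
impl Animal for Dog {}

#[allow(dead_code)] // the field is never read in this sketch
struct Cat {
    name: String,
}
impl Animal for Cat {}

// Signature shared by every parsing function in the registry.
type Parser = fn(&str) -> Result<Box<dyn Animal>, ()>;

fn parse_cat(data: &str) -> Result<Box<dyn Animal>, ()> {
    Ok(Box::new(Cat { name: data.to_string() }))
}

fn parse_dog(data: &str) -> Result<Box<dyn Animal>, ()> {
    let age = data.parse::<u8>().map_err(|_| ())?;
    Ok(Box::new(Dog { age }))
}

// Hypothetical registry mapping each identifier to its parsing function.
fn registry() -> HashMap<&'static str, Parser> {
    let mut parsers: HashMap<&'static str, Parser> = HashMap::new();
    parsers.insert("Cat", parse_cat);
    parsers.insert("Dog", parse_dog);
    parsers
}

// Look up the parser for `kind` and run it; unknown kinds are an error.
fn try_parse(
    parsers: &HashMap<&'static str, Parser>,
    kind: &str,
    data: &str,
) -> Result<Box<dyn Animal>, ()> {
    let parser = parsers.get(kind).ok_or(())?;
    parser(data)
}

fn main() {
    let parsers = registry();
    let zoo: Vec<Box<dyn Animal>> = ["Cat,Persik", "Dog,5"]
        .iter()
        .filter_map(|line| line.split_once(','))
        .filter_map(|(kind, data)| try_parse(&parsers, kind, data).ok())
        .collect();
    println!("parsed {} animals", zoo.len());
}
```

This replaces the `match` with a table lookup, though every animal still has to be registered somewhere, which is the point the comment makes.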