In dynamic languages, like Clojure, it is easy to express collections with different types:

{:key1 "foo", :key2 [34 "bar" 4.5], "key3" {:key4 "foobar"}}

In Rust, the preferred way to implement such collections is using trait objects or enums. The use of the Any trait object seems to be the most flexible approach (if there isn't a fixed number of known type alternatives) as it allows downcasting to the actual object types:

use std::any::Any;

// Trait objects are written with `dyn` in current Rust.
let mut vector: Vec<Box<dyn Any>> = Vec::new();
vector.push(Box::new("I’m"));
vector.push(Box::new(4u32));
console!(log, vector[0].downcast_ref::<&str>());
console!(log, vector[1].downcast_ref::<u32>());

This approach seems to be discouraged. What are its disadvantages?

Peter Hall
dilvan
  • What does Clojure have to do with your question? – Shepmaster Jul 09 '18 at 18:56
  • It is just an example of a heterogeneous collection in another language. – dilvan Jul 09 '18 at 20:44
  • What are the downsides of using such a collection in Clojure? Pretty sure you'll find the same downsides in any language. – Shepmaster Jul 09 '18 at 20:48
  • "(if there isn't a fixed number of known type alternatives)" If you don't know the possible types of your data, then you can't downcast to them. Which means you can't do anything with them. – Wesley Wiser Jul 09 '18 at 21:00

1 Answer

What are its disadvantages?

The main disadvantage is that, when you access a value from a collection of &dyn Any, the only thing you can do with it is downcast it to a specific, known type. Values of a type you don't know about are completely opaque: you can do nothing with them except count how many there are. Even when the set of possible concrete types is known, you have to try a downcast to each one in turn until one succeeds.

Here's an example:

use std::any::Any;

let a = 1u32;
let b = 2.0f32;
let c = "hello";

let v: Vec<&dyn Any> = vec![&a, &b, &c];

match v[0].downcast_ref::<u32>() {
    Some(x) => println!("u32: {:?}", x),
    None => println!("Not a u32!"),
}

Notice that I had to explicitly downcast to a u32. Using this approach would involve a logical branch for every possible concrete type, and no compiler warnings if I had forgotten a case.
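To make the "one branch per concrete type" cost concrete, here is a minimal sketch that tries each known type in turn. The helper name `describe` and the extra `u8` value are hypothetical, added only for illustration; any type without a branch falls through to the final `else` and stays opaque.

```rust
use std::any::Any;

// Hypothetical helper: try each concrete type we know about, one at a time.
fn describe(value: &dyn Any) -> String {
    if let Some(x) = value.downcast_ref::<u32>() {
        format!("u32: {:?}", x)
    } else if let Some(x) = value.downcast_ref::<f32>() {
        format!("f32: {:?}", x)
    } else if let Some(x) = value.downcast_ref::<&str>() {
        format!("str: {}", x)
    } else {
        // Nothing above matched, so the value is opaque to us.
        String::from("unknown type")
    }
}

fn main() {
    let a = 1u32;
    let b = 2.0f32;
    let c = "hello";
    let d = 4u8; // no branch in `describe` handles u8

    let v: Vec<&dyn Any> = vec![&a, &b, &c, &d];
    for item in &v {
        println!("{}", describe(*item));
    }
}
```

Adding a new type to the collection compiles without complaint even if `describe` is never updated; the value simply ends up in the fallback branch at run time.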

Trait objects for a more specific trait are more versatile, because you don't need to know the concrete types in order to use the values, as long as you stick to only using the methods of that trait.

For example:

use std::fmt::Debug;

let v: Vec<&dyn Debug> = vec![&a, &b, &c];

println!("{:?}", v[0]); // 1
println!("{:?}", v[1]); // 2.0
println!("{:?}", v[2]); // "hello"

I was able to use all of the values without knowing their concrete types, because I only used the fact that they implement Debug.
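The same idea carries over to a trait of your own. The sketch below assumes a hypothetical `Summary` trait and a hypothetical `summarize_all` function, neither of which comes from the standard library; the point is only that the calling code never names a concrete type.

```rust
// Hypothetical trait describing the one behaviour the collection needs.
trait Summary {
    fn summary(&self) -> String;
}

impl Summary for u32 {
    fn summary(&self) -> String {
        format!("integer {}", self)
    }
}

impl Summary for f32 {
    fn summary(&self) -> String {
        format!("float {:?}", self)
    }
}

impl<'a> Summary for &'a str {
    fn summary(&self) -> String {
        format!("text of length {}", self.len())
    }
}

// Works for any mix of types implementing `Summary`, with no downcasting.
fn summarize_all(items: &[&dyn Summary]) -> Vec<String> {
    items.iter().map(|item| item.summary()).collect()
}

fn main() {
    let a = 1u32;
    let b = 2.0f32;
    let c = "hello";

    let v: Vec<&dyn Summary> = vec![&a, &b, &c];
    for line in summarize_all(&v) {
        println!("{}", line);
    }
}
```

A new type joins the collection by implementing `Summary`; `summarize_all` does not need to change.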

Both of those approaches share a downside compared with using a concrete type in a homogeneous collection: everything is hidden behind pointers. Accessing the data is always indirect, and that data could end up being spread around in memory, making it much less efficient to access and harder for the compiler to optimize.

Making the collection homogeneous with an enum looks like this:

enum Item<'a> {
    U32(u32),
    F32(f32),
    Str(&'a str),
}

let v: Vec<Item> = vec![Item::U32(a), Item::F32(b), Item::Str(c)];
match v[0] {
    Item::U32(x) => println!("u32: {:?}", x),
    Item::F32(x) => println!("f32: {:?}", x),
    Item::Str(x) => println!("str: {:?}", x),
}

Here, I still have to know all of the types, but at least the compiler would reject the match with an error if I missed one. Also notice that the enum can own its values, so (apart from the &str in this case) the data can be tightly packed in memory, making it faster to access.
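As a self-contained sketch of the enum approach, the variant below owns all of its data by storing a `String` instead of a `&str`. The `Text` variant and the `label` function are hypothetical names chosen for this illustration.

```rust
// Every variant owns its payload, so the Vec stores the items inline.
enum Item {
    U32(u32),
    F32(f32),
    Text(String),
}

// The match must cover every variant. Deleting an arm makes the
// compiler reject the function (error E0004, non-exhaustive patterns).
fn label(item: &Item) -> String {
    match item {
        Item::U32(x) => format!("u32: {:?}", x),
        Item::F32(x) => format!("f32: {:?}", x),
        Item::Text(s) => format!("text: {}", s),
    }
}

fn main() {
    let v = vec![
        Item::U32(1),
        Item::F32(2.0),
        Item::Text(String::from("hello")),
    ];
    for item in &v {
        println!("{}", label(item));
    }
}
```

The enum values sit contiguously inside the `Vec`, although the character data of a `String` still lives in its own heap allocation.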

In summary, Any is rarely the right answer for a heterogeneous collection, but both trait objects and enums have their own trade-offs.

Wesley Wiser
Peter Hall
  • Additionally, if you want a group of `N` elements of heterogeneous types where the size of the group is known in advance and where each element has a fixed type (e.g. the first one will always be a `u32`, the second one will always be an `f32` and so on), then what you want is a tuple: `let v: (u32, f32, &str) = (a, b, c);` – Jmb Jul 10 '18 at 06:52