I have a function using a constant generic:

fn foo<const S: usize>() -> Vec<[String; S]> {
    // Some code
    let mut row: [String; S] = Default::default(); // Fails: `Default` is only implemented for arrays of up to 32 elements
    // Some code
}

How can I create a fixed-size array of Strings in my case? let mut row: [String; S] = ["".to_string(); S]; doesn't work because String doesn't implement the Copy trait.

Shepmaster
Denis Sologub
  • You may try to play with procedural macros, but it's quite verbose and doesn't compile on stable. – Alexey S. Larionov Aug 17 '20 at 12:29
  • @AlexLarionov const generics don't compile on stable either. – Peter Hall Aug 17 '20 at 12:29
  • Instead, you can return a `Vec` and handle the row/column separation in code. – Alexey S. Larionov Aug 17 '20 at 12:30
  • @AlexLarionov, yes, I know about `Vec` but I hoped there is a way to use a fixed size array. – Denis Sologub Aug 17 '20 at 12:32
  • *arrays are specified up to 32 only* - Wasn't that limitation [lifted](https://github.com/rust-lang/rust/pull/74060)? – user4815162342 Aug 17 '20 at 16:42
  • Oh, I now see that it was lifted for [everything except `Default`](https://github.com/rust-lang/rust/pull/74060#issuecomment-653889119). – user4815162342 Aug 17 '20 at 16:46
  • It's hard to answer your question because it doesn't include a [MRE]. We can't tell what version of (nightly?) Rust you are using or what nightly features are enabled in the code. It would make it easier for us to help you if you try to reproduce your error on the [Rust Playground](https://play.rust-lang.org) if possible, otherwise in a brand new Cargo project, then [edit] your question to include the additional info. There are [Rust-specific MRE tips](//stackoverflow.com/tags/rust/info) you can use to reduce your original code for posting here. Thanks! – Shepmaster Aug 17 '20 at 16:53

2 Answers

5

You can do it with MaybeUninit and unsafe:

use std::mem::MaybeUninit;

fn foo<const S: usize>() -> Vec<[String; S]> {
    // Some code

    let mut row: [String; S] = unsafe {
        let mut result = MaybeUninit::uninit();
        let start = result.as_mut_ptr() as *mut String;
        
        for pos in 0 .. S {
            // SAFETY: safe because loop ensures `start.add(pos)`
            //         is always on an array element, of type String
            start.add(pos).write(String::new());
        }

        // SAFETY: safe because loop ensures entire array
        //         has been manually initialised
        result.assume_init()
    };

    // Some code

    todo!()
}

Of course, it might be easier to abstract such logic to your own trait:

use std::mem::MaybeUninit;

trait DefaultArray {
    fn default_array() -> Self;
}

impl<T: Default, const S: usize> DefaultArray for [T; S] {
    fn default_array() -> Self {
        let mut result = MaybeUninit::uninit();
        let start = result.as_mut_ptr() as *mut T;
        
        unsafe {
            for pos in 0 .. S {
                // SAFETY: safe because loop ensures `start.add(pos)`
                //         is always on an array element, of type T
                start.add(pos).write(T::default());
            }

            // SAFETY: safe because loop ensures entire array
            //         has been manually initialised
            result.assume_init()
        }
    }
}

(The only reason for using your own trait rather than Default is that implementations of the latter would conflict with those provided in the standard library for arrays of up to 32 elements; I wholly expect the standard library to replace its implementation of Default with something similar to the above once const generics have stabilised).

In which case you would now have:

fn foo<const S: usize>() -> Vec<[String; S]> {
    // Some code

    let mut row: [String; S] = DefaultArray::default_array();

    // Some code

    todo!()
}

See it on the Playground.
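
For readers on a newer toolchain, a safe alternative may be worth a look. The sketch below assumes `std::array::from_fn` is available (it was stabilised in Rust 1.63, well after this answer was written); `default_row` is a hypothetical helper name, not something from the code above.

```rust
use std::array;

// Hypothetical helper: builds a `[String; S]` of any length without `unsafe`.
fn default_row<const S: usize>() -> [String; S] {
    // `from_fn` calls the closure once per index and collects the results,
    // so `String` only needs to be constructible, not `Copy`.
    array::from_fn(|_| String::new())
}
```

With this, `let row: [String; 40] = default_row();` would give an array of forty empty strings, and the caller needs neither `unsafe` nor a custom trait.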

eggyal
  • Beautiful answer, love the generalization. – user4815162342 Aug 17 '20 at 17:05
  • Should the semicolon after `T: Default` be a comma instead? Also (and this could just be a matter of style), you could use the pointer arithmetic without mutating `ptr`, relying on the optimizer to do the right thing: [playground](https://play.rust-lang.org/?version=nightly&mode=debug&edition=2018&gist=fb689b346718aab1f7fbc7611b94422b). That way the relation between the pointer and the array is a bit more direct. – user4815162342 Aug 17 '20 at 17:10
  • @user4815162342: Yes, it should have been a comma—already updated, thanks. And yes, that's valid too. – eggyal Aug 17 '20 at 17:11
  • @user4815162342: I've updated my answer to use your suggestion. – eggyal Aug 17 '20 at 17:15
  • Hmm, I actually think the assignment in the loop is unsafe because it will run the destructor on the uninitialized value if `T` has one! You should use `start.add(pos).write(T::default())` instead, which "Overwrites a memory location with the given value without reading or dropping the old value." – user4815162342 Aug 17 '20 at 17:24
  • In short, I can do it, but only with unsafe code. Thank you for the answer! I hope this will be supported in a future release so that the `unsafe` blocks can be avoided. – Denis Sologub Aug 17 '20 at 17:29
  • @user4815162342: Good catch! – eggyal Aug 17 '20 at 17:30
  • @Шах please take note of the update per the comment above. This almost certainly will be implemented in the standard library before const generics are stabilised. These sorts of bumps are what you have to deal with when you choose to use unfinished, pre-release features. – eggyal Aug 17 '20 at 17:31
  • Another possibility is to cast the pointer `as *mut MaybeUninit<T>` and assign using `*start.add(pos) = MaybeUninit::new(T::default())`. But `ptr.write(T::default())` seems a bit cleaner. And the `MaybeUninit` docs provide a [third possibility](https://doc.rust-lang.org/std/mem/union.MaybeUninit.html#initializing-an-array-element-by-element) that even uses `transmute` in the final step. – user4815162342 Aug 17 '20 at 17:43
  • @user4815162342: No, that would produce `[MaybeUninit<T>; S]` as opposed to a `MaybeUninit<[T; S]>`, and then you have a problem getting `[T; S]` out from the result. Okay, with transmute it can be done... but I'm not sure I understand the benefit. – eggyal Aug 17 '20 at 17:45
  • The alternative assignment works because `result` would still contain `MaybeUninit<[T; S]>`, only the type of the individual pointer would change. The third option from the documentation is different and does require the final transmute. I'm not proposing to change the code in the answer, just examining alternatives in order to understand them and perhaps spot something unsafe in the approach taken here - which I haven't been able to. This is why people are wary of `unsafe`, you are never *completely* certain you didn't forget something. – user4815162342 Aug 17 '20 at 17:54
  • @user4815162342: Aah, I see what you're saying. But writing `MaybeUninit<T>` into a `T` feels very wrong... even though it's probably (almost certainly?) harmless, I think it's relying a bit too much on undocumented behaviour? – eggyal Aug 17 '20 at 17:57
  • Not really, `MaybeUninit` is the one place where such things are [explicitly documented](https://doc.rust-lang.org/std/mem/union.MaybeUninit.html#initializing-an-array-element-by-element) to be allowed. (The linked documentation performs a possibly even "worse" transmute.) – user4815162342 Aug 17 '20 at 18:09
  • @user4815162342: Sure, the docs transmute `[MaybeUninit<T>; S]` to `[T; S]` but unless I'm mistaken they don't ever write a `MaybeUninit<T>` into a `T`? – eggyal Aug 17 '20 at 18:22
  • So, really this is the same as [What is the proper way to initialize a fixed length array?](https://stackoverflow.com/a/31361031/155423); [Why use `ptr::write` with arrays of `MaybeUninit`s?](https://stackoverflow.com/q/56997322/155423); [How do I create and initialize an immutable array?](https://stackoverflow.com/q/26435046/155423); etc. – Shepmaster Aug 17 '20 at 18:36
  • My proposed assignment doesn't either, it writes `MaybeUninit<T>` into a `*mut MaybeUninit<T>`. The latter is of course obtained by a cast from `T`, but that `T` is sitting inside a `MaybeUninit<[T; S]>` itself. It should be as safe as the transmute of the whole thing, since the whole point of `MaybeUninit<T>` is that it is "compatible" with `T`. But I'm not a compiler expert, so take my opinion with a grain of salt. – user4815162342 Aug 17 '20 at 18:39
  • BTW I would omit the panic handling. It's unnecessary for the string example because `String::new` can't panic, it will abort the whole process if it's unable to allocate. In the generic code `T::default()` could panic, but I'd still opt to leak in that case for simplicity - if someone panics in the middle of `T::default()`, they deserve much worse than several leaked `T`s. – user4815162342 Aug 17 '20 at 18:43
  • @user4815162342: That `String::new()` doesn't panic (indeed it doesn't even attempt to allocate—that only happens if/when one pushes something onto the string), nor indeed `ptr::add()` and `ptr::write()`, are really implementation details of the standard library... I guess the docs *ought* to state if they might panic, and they don't, so perhaps one can rely on that? Not entirely sure whether panics are guaranteed to be noted in the docs if they might occur. – eggyal Aug 17 '20 at 18:56
  • I believe both abort on memory failure and `String::new()` not allocating are documented. The former is, I believe, documented in the nomicon, and `String` is publicly [based on `Vec`](https://doc.rust-lang.org/std/string/struct.String.html#method.as_mut_vec), and it's [documented for `Vec`](https://doc.rust-lang.org/std/vec/struct.Vec.html#guarantees). As a systems language Rust takes great care to document aspects of standard library types that are helpful when doing unsafe and FFI. – user4815162342 Aug 17 '20 at 20:15
  • But even if you had to handle panic explicitly, I think an object implementing `Drop` would be a better mechanism. `catch_unwind` in the middle of a loop looks like it might wreak havoc with LLVM's optimizations, and the [documentation](https://doc.rust-lang.org/beta/std/panic/fn.catch_unwind.html) warns against using it. I might be wrong, but I get a strong sense from the docs that `catch_unwind` is meant to catch panics in servers executing plugins and such, and for things like test harnesses, not for low-level code like this. – user4815162342 Aug 17 '20 at 20:19
  • @user4815162342 can’t do it in `Drop` because you won’t know there which elements have been initialised and which haven’t. – eggyal Aug 17 '20 at 20:21
  • The `Drop` could be implemented on a custom RAII-style object (stack-allocated inside `default_array` and therefore zero-cost) which _could_ contain that information. – user4815162342 Aug 17 '20 at 20:33
  • For anyone who's interested, I did as @user4815162342 suggested [here](https://play.rust-lang.org/?version=nightly&mode=debug&edition=2018&gist=a09e86e438b39c64c64eef5c7abef150) but it's pretty distracting from the answer to the original question so I've rolled back to not dealing with memory leaks. – eggyal Aug 18 '20 at 17:46
  • Looks like a fun exercise! It would be interesting to compare assembly output of that version with the version using `catch_unwind` (or simply benchmark them both!) to check which one is more efficient in the happy (non-panicking) case. – user4815162342 Aug 18 '20 at 17:52
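
The RAII idea from the comments above can be sketched roughly as follows. This is an illustrative reconstruction, not the code behind the playground link: `Guard` and `default_array_guarded` are hypothetical names. The guard tracks how many leading elements have been written, so that a panic in `T::default()` drops exactly those elements instead of leaking them.

```rust
use std::mem::{self, MaybeUninit};
use std::ptr;

/// Hypothetical guard: remembers how many leading elements are initialised.
struct Guard<T> {
    start: *mut T,
    initialised: usize,
}

impl<T> Drop for Guard<T> {
    fn drop(&mut self) {
        // SAFETY: exactly the first `initialised` slots hold valid `T`s.
        unsafe {
            ptr::drop_in_place(ptr::slice_from_raw_parts_mut(self.start, self.initialised));
        }
    }
}

fn default_array_guarded<T: Default, const S: usize>() -> [T; S] {
    let mut result = MaybeUninit::<[T; S]>::uninit();
    let mut guard = Guard {
        start: result.as_mut_ptr() as *mut T,
        initialised: 0,
    };

    for pos in 0..S {
        // If this call panics, `guard` is dropped during unwinding and
        // cleans up the elements written so far.
        let value = T::default();
        // SAFETY: `pos < S`, so the pointer stays inside the array.
        // `write` does not drop the old (uninitialised) contents,
        // which a plain `*ptr = value` assignment would do.
        unsafe { guard.start.add(pos).write(value) };
        guard.initialised += 1;
    }

    // Every slot is initialised: disarm the guard so nothing is dropped twice.
    mem::forget(guard);
    // SAFETY: all `S` elements were written in the loop above.
    unsafe { result.assume_init() }
}
```

In the non-panicking case the guard only adds a counter increment per element; whether that is cheaper than a `catch_unwind` version would need to be measured, as suggested above.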
-2

As of now, there is no way to compile constant generics. As @AlexLarionov said, you can try to use procedural macros, but that approach still has its bugs and limitations.

If you need a generic that has to be a number, you can use the Num crate, or the more verbose std::num.

Fluffyeater
  • The nightly compiler can compile const generics, but there doesn't seem to be a way to initialise a `String` array. – Denis Sologub Aug 17 '20 at 13:46
  • @Fluffyeater The OP is aware that const generics are not yet available on stable, but is explicitly opting into them on nightly in order to see how they work and eventually provide feedback to developers. (This is how nightly is meant to be used.) – user4815162342 Aug 17 '20 at 17:41