The short version:
I need to call a function in code I can't modify. The function takes an obj[], and I want to pass it a 'T[]. I could use Array.map box, but I'm trying to avoid creating an intermediate array. Is there a direct way to convert a 'T[] to obj[] without passing through Array.map box or any other code that would create an intermediate array?
The long version:
I'm trying to write code that needs to interoperate with the PersistentVector class from FSharpx.Collections. (Specifically, I'm trying to implement RRB-Trees in F#.) PersistentVector is basically a B-tree with a branching factor of 32. Each node in the tree contains one of two things: either other nodes (if the node is NOT a leaf node), or the items stored in the tree (if the node IS a leaf node). Now, the most natural way to represent this data structure in F# would be with a discriminated union like type Node<'T> = TreeNode of Node<'T>[] | LeafNode of 'T[]. But for what I assume are performance reasons, the FSharpx.Collections.PersistentVector code instead defines its Node class as follows:
type Node(thread,array:obj[]) =
    let thread = thread
    new() = Node(ref null,Array.create Literals.blockSize null)
    with
        static member InCurrentThread() = Node(ref Thread.CurrentThread,Array.create Literals.blockSize null)
        member this.Array = array
        member this.Thread = thread
        member this.SetThread t = thread := t
The threading code is unrelated to my current problem (it's used in transient vectors, which allow certain performance improvements), so let's remove it for the sake of creating the simplest summary of the problem. After removing the thread-related code, we have a Node definition that looks like this:
type Node(array:obj[]) =
    new() = Node([||])
    with member this.Array = array
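To make the trade-off concrete, here is a minimal sketch of how a single obj[] field ends up holding either child nodes or items, next to the typed union mentioned earlier. The Node type repeats the simplified definition; TypedNode and the sample values are hypothetical names used only for illustration and are not part of FSharpx.Collections.

```fsharp
// Simplified Node, repeated here so the sketch is self-contained.
type Node(array: obj[]) =
    new() = Node([||])
    member this.Array = array

// Hypothetical typed alternative, shown for comparison only.
type TypedNode<'T> =
    | TreeNode of TypedNode<'T>[]
    | LeafNode of 'T[]

// A leaf stores its items boxed as obj...
let leaf = Node([| box 1; box 2; box 3 |])
// ...and an internal node stores its children, also as obj.
let parent = Node([| box leaf |])

// Reading back requires runtime downcasts; the compiler cannot check them.
let firstChild = parent.Array.[0] :?> Node
let firstItem = firstChild.Array.[0] :?> int

// The typed version carries the same shape with compile-time checking.
let typedTree = TreeNode [| LeafNode [| 1; 2; 3 |] |]
```

With the obj[] representation, nothing stops a leaf's array from containing a Node or an internal node's array from containing an int, which is why the casts above are only as safe as the code that built the tree.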
I want my implementation of RRB Trees to interoperate smoothly with the existing PersistentVector class, because the set of all valid PersistentVector trees is a strict subset of the set of all valid RRB trees. As part of that implementation, I have an RRBNode class that inherits from Node (and therefore must also take an obj[] parameter in its constructor), and I often need to create new instances of either Node or RRBNode. For example, my implementation of RRBTree.ofArray basically looks like this:
let ofArray<'T> (arr:'T[]) =
    let leaves = arr |> Array.chunkBySize 32 |> Array.map Node
    // More code here to build a tree above those leaf nodes
Or rather, I would like to define it like that, but I can't. The code above gives me a type mismatch error on the Array.map Node call. The Node constructor takes an obj[], and the error message reports that "the type 'T[] is not compatible with the type obj[]".
One approach I tried to solve this problem was to use box and unbox. https://stackoverflow.com/a/7339153/2314532 led me to believe that piping an array of any type through box followed by unbox would cast that array to obj[]. Yes, this is basically a misfeature of the .NET type system that compromises type safety (the cast that passes at compile time might fail at runtime), but because I need to interoperate with the Node class from PersistentVector, I don't have the benefits of type safety anyway (since Node uses obj instead of a discriminated union). So for this one part of my code, I actually want to tell the F# compiler "Stop protecting me here, please, I know what I'm doing and I've written extensive unit tests". But my attempt to use the box >> unbox approach failed at runtime:
let intArray = [|1;2;3;4;5|]
let intNode = Node(intArray) // Doesn't compile: Type mismatch. Expecting obj[] but got int[]
let objArray : obj[] = intArray |> box |> unbox // Compiles, but fails at runtime: InvalidCastException
let objNode = Node(objArray)
(I made the type of objArray explicit to make reading this minimal example as easy as possible, but I didn't need to write it: F# correctly infers its required type from the call to Node(objArray) on the next line. The equivalent part of my actual code doesn't have explicit type annotations, but the obj[] array type is still inferred, and it's that same int[] to obj[] cast, via |> box |> unbox, that is causing the InvalidCastException in my actual code.)
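For comparison, here is a small sketch of how the same box-then-unbox cast behaves for two different element types. It relies on .NET array covariance applying only to arrays of reference types, which would explain why a string[] gets through while an int[] does not. The helper name tryCastToObjArray is hypothetical.

```fsharp
open System

// Hypothetical helper: attempts the box >> unbox cast and reports the outcome
// instead of letting the exception escape.
let tryCastToObjArray (arr: 'T[]) : Result<obj[], string> =
    try
        Ok (arr |> box |> unbox<obj[]>)
    with
    | :? InvalidCastException as ex -> Error ex.Message

let strings = [| "a"; "b"; "c" |]
let ints = [| 1; 2; 3 |]

// Reference-type elements: the cast is expected to succeed without copying.
let stringResult = tryCastToObjArray strings
// Value-type elements: the cast is expected to throw InvalidCastException.
let intResult = tryCastToObjArray ints

match stringResult with
| Ok objs -> printfn "string[] cast succeeded; same instance: %b" (obj.ReferenceEquals(objs, strings))
| Error msg -> printfn "string[] cast failed: %s" msg

match intResult with
| Ok _ -> printfn "int[] cast succeeded"
| Error msg -> printfn "int[] cast failed: %s" msg
```

If this holds, the cast is only a zero-copy option when 'T is a reference type; for value types such as int, each element still has to be boxed individually somewhere.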
Another approach that would probably work would be to insert a call to Array.map box into my Node-creating pipeline:
let ofArray<'T> (arr:'T[]) =
    let leaves = arr |> Array.chunkBySize 32 |> Array.map (Array.map box >> Node)
    // More code here to build a tree above those leaf nodes
This does what I want (creates an array of Node instances, which will become the leaves in the tree), but it creates an extra intermediate array in the process. I'd like to let the chunked arrays become the Node arrays directly; otherwise I'll be burning through O(N) memory and creating unnecessary GC pressure. I've thought about using Seq.cast at some point in the pipeline, but I'm worried about the performance effects of doing so. Turning arrays of known size (here, 32) into seqs means that other code, which needs arrays (to create Node instances), would have to call Array.ofSeq first, and Array.ofSeq is implemented with ResizeArray since it can't count on the size of the seqs in the general case. There's an optimization for seqs that are already arrays, but even that version of Array.ofSeq creates a new array as its return value (which is precisely the correct behavior for the general case, but precisely what I'm trying to avoid here).
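The allocation behaviour described above can be checked with a short sketch. It uses only FSharp.Core functions; the sample array and the binding names are illustrative.

```fsharp
let chunk = [| 1; 2; 3; 4 |]

// Array.map box allocates a fresh obj[] and boxes every element into it.
let boxedCopy : obj[] = chunk |> Array.map box

// Seq.cast yields a lazy seq<obj>; Array.ofSeq then materialises it
// into yet another newly allocated array.
let viaSeq : obj[] = chunk |> Seq.cast<obj> |> Array.ofSeq

// Even when the input is already an array, Array.ofSeq returns a copy
// rather than the original instance.
let copied = Array.ofSeq chunk

printfn "Array.ofSeq returned the same instance: %b" (obj.ReferenceEquals(chunk, copied))
printfn "boxedCopy length: %d, viaSeq length: %d" boxedCopy.Length viaSeq.Length
```

Both routes end with one newly allocated obj[] per chunk, so neither avoids the intermediate array; they differ only in how much extra work happens along the way.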
Is there any way for me to cast my 'T[] arrays to obj[], deliberately forfeiting type safety, without creating the intermediate arrays that I've been trying so hard to avoid? Or am I going to have to write this one bit of code in C# so that I can do the unsafe things that the F# compiler is trying to protect me from?