
The short version:

I need to call a function in code I can't modify. The function takes an obj[], and I want to pass it a 'T[]. I could use Array.map box, but I'm trying to avoid creating an intermediate array. Is there a direct way to convert a 'T[] to obj[] without passing through Array.map box or any other code that would create an intermediate array?

The long version:

I'm trying to write code that needs to interoperate with the PersistentVector class from FSharpx.Collections. (Specifically, I'm trying to implement RRB-Trees in F#.) PersistentVector is basically a B-tree with a branching factor of 32. Each node in the tree contains one of two things: either other nodes (if the node is NOT a leaf node), or the items stored in the tree (if the node IS a leaf node). Now, the most natural way to represent this data structure in F# would be with a discriminated union like type Node<'T> = TreeNode of Node<'T>[] | LeafNode of 'T[]. But for what I assume are performance reasons, the FSharpx.Collections.PersistentVector code instead defines its Node class as follows:

type Node(thread,array:obj[]) =
    let thread = thread
    new() = Node(ref null,Array.create Literals.blockSize null)
    with
        static member InCurrentThread() = Node(ref Thread.CurrentThread,Array.create Literals.blockSize null)
        member this.Array = array
        member this.Thread = thread
        member this.SetThread t = thread := t

The threading code is unrelated to my current problem (it's used in transient vectors, which allow certain performance improvements), so let's remove it for the sake of creating the simplest summary of the problem. After removing the thread-related code, we have a Node definition that looks like this:

type Node(array:obj[]) =
    new() = Node([||])
    with member this.Array = array

I want my implementation of RRB Trees to interoperate smoothly with the existing PersistentVector class, because the set of all valid PersistentVector trees is a strict subset of the set of all valid RRB trees. As part of that implementation, I have an RRBNode class that inherits from Node (and therefore must also take an obj[] parameter in its constructor), and I often need to create new instances of either Node or RRBNode. For example, my implementation of RRBTree.ofArray basically looks like this:

let ofArray<'T> (arr:'T[]) =
    let leaves = arr |> Array.chunkBySize 32 |> Array.map Node
    // More code here to build a tree above those leaf nodes

Or rather, I would like to define it like that, but I can't. The code above gives me a type mismatch error on the Array.map Node call. The Node constructor takes an obj[], and the error message reports that "the type 'T[] is not compatible with the type obj[]".

One approach I tried to solve this problem was to use box and unbox. https://stackoverflow.com/a/7339153/2314532 led me to believe that piping an array of any type through box followed by unbox would lead to casting that array to obj[]. Yes, this is basically a misfeature of the .Net type system that compromises type safety (the cast that passes at compile time might fail at runtime) — but because I need to interoperate with the Node class from PersistentVector, I don't have the benefits of type safety anyway (since Node has used obj instead of a discriminated union). So for this one part of my code, I actually want to tell the F# compiler "Stop protecting me here, please, I know what I'm doing and I've written extensive unit tests". But my attempt to use the box >> unbox approach failed at runtime:

let intArray = [|1;2;3;4;5|]
let intNode = Node(intArray) // Doesn't compile: Type mismatch. Expecting obj[] but got int[]

let objArray : obj[] = intArray |> box |> unbox // Compiles, but fails at runtime: InvalidCastException
let objNode = Node(objArray)

(I made the type of objArray explicit to make reading this minimal example as easy as possible, but I didn't need to write it: F# correctly infers its required type from the call to Node(objArray) on the next line. The equivalent part of my actual code doesn't have explicit type annotations, but the obj[] array type is still inferred, and it's that same int[] to obj[] cast, via |> box |> unbox, that is causing the InvalidCastException in my actual code.)

Another approach that would probably work would be to insert a call to Array.map box into my Node-creating pipeline:

let ofArray<'T> (arr:'T[]) =
    let leaves = arr |> Array.chunkBySize 32 |> Array.map (Array.map box >> Node)
    // More code here to build a tree above those leaf nodes

This does what I want (creates an array of Node instances, which will become the leaves in the tree), but it creates an extra intermediate array in the process. I'd like to let the chunked arrays become the Node arrays directly, otherwise I'll be burning through O(N) memory and creating unnecessary GC pressure. I've thought about using Seq.cast at some point in the pipeline, but I'm worried about the performance effects of using Seq.cast. Turning arrays of known size (here, 32) into seqs means that other code, which needs arrays (to create Node instances), would have to call Array.ofSeq first, and Array.ofSeq is implemented with ResizeArray since it can't count on the size of the seqs in the general case. There's an optimization for seqs that are already arrays, but even that version of Array.ofSeq creates a new array as its return value (which is precisely the correct behavior for the general case, but precisely what I'm trying to avoid here).

Is there any way for me to cast my 'T[] arrays to obj[], deliberately forfeiting type safety, without creating the intermediate arrays that I've been trying so hard to avoid? Or am I going to have to write this one bit of code in C# so that I can do the unsafe things that the F# compiler is trying to protect me from?

rmunn

1 Answer


There are two possible outcomes depending on whether 'T is a value or a reference type.

Reference Types

If 'T is a reference type, then your box unbox trick is going to work just fine, because .NET arrays of reference types are covariant:

let strArray = [|"a";"b";"c";"d";"e"|]
let objArray : obj[] = strArray |> box |> unbox
val strArray : string [] = [|"a"; "b"; "c"; "d"; "e"|]
val objArray : obj [] = [|"a"; "b"; "c"; "d"; "e"|]
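One caveat with the cast above, sketched here on the assumption that the callee may write to the array: the result is still the original string[] instance viewed as obj[], not a copy. Storing a string through it succeeds, but storing anything else is expected to throw ArrayTypeMismatchException. The names strs, asObjs, sameInstance and storeFailed are illustrative only.

```fsharp
open System

let strs = [| "a"; "b"; "c" |]
// Covariant cast: no new array is allocated.
let asObjs : obj[] = strs |> box |> unbox

// Both names refer to the same array instance.
let sameInstance = Object.ReferenceEquals(strs, asObjs)

// Storing a string through the obj[] view is fine and is visible via strs.
asObjs.[0] <- box "z"

// Storing a non-string fails the runtime element-type check.
let storeFailed =
    try
        asObjs.[1] <- box 42
        false
    with :? ArrayTypeMismatchException -> true
```

So the cast is safe as long as every value written to the array is actually a 'T.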

Value Types

If 'T is a value type then, as you've noticed, the conversion will fail at runtime.

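There is simply no way to make that conversion succeed because the value types in the array haven't been boxed. An int[] stores the integers themselves inline, whereas an obj[] stores references to heap objects, so the one cannot be reinterpreted as the other. There is no possible way to circumvent the type system and convert to obj[] directly. You are going to have to box each element explicitly: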

let intArray = [|1; 2; 3; 4; 5|]
let objArray : obj[] = intArray |> Array.map (box)
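The difference between the two cases can be checked at runtime with a type test on the boxed array, which mirrors the check that unbox performs. This is a minimal sketch; the names ints, strs, intIsObjArray, strIsObjArray and castFailed are illustrative only.

```fsharp
let ints = [| 1; 2; 3; 4; 5 |]
let strs = [| "a"; "b"; "c" |]

// A value-type array is not an obj[] as far as the runtime is concerned...
let intIsObjArray = (box ints) :? obj[]
// ...but a reference-type array is, thanks to array covariance.
let strIsObjArray = (box strs) :? obj[]

// The box >> unbox cast therefore throws for the int[] case.
let castFailed =
    try
        ints |> box |> unbox<obj[]> |> ignore
        false
    with :? System.InvalidCastException -> true
```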

Handling both

You could write a generic conversion function to check whether the type is a reference or a value type and then perform the appropriate conversion:

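let convertToObjArray<'T> (arr : 'T[]) : obj[] =
    if typeof<'T>.IsValueType then
        // Value types: every element must be boxed, which allocates a new array.
        arr |> Array.map box
    else
        // Reference types: array covariance makes this a cast with no copy.
        arr |> box |> unbox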

Usage:

convertToObjArray strArray
val it : obj [] = [|"a"; "b"; "c"; "d"; "e"|]
convertToObjArray intArray
val it : obj [] = [|1; 2; 3; 4; 5|]
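Plugged into the leaf-building pipeline from the question, this might look like the sketch below. Node here is the simplified stand-in from the question (thread field omitted, not the real FSharpx class), and leavesOfArray is a hypothetical name for the first step of ofArray.

```fsharp
// Simplified stand-in for the FSharpx Node class, as in the question.
type Node(array : obj[]) =
    member this.Array = array

let convertToObjArray<'T> (arr : 'T[]) : obj[] =
    if typeof<'T>.IsValueType then
        arr |> Array.map box      // boxed copy, one per chunk
    else
        arr |> box |> unbox       // covariant cast, no copy

// Split the input into chunks of 32 and wrap each chunk in a Node.
let leavesOfArray (arr : 'T[]) =
    arr
    |> Array.chunkBySize 32
    |> Array.map (convertToObjArray >> Node)

let intLeaves = leavesOfArray [| 1 .. 70 |]
let strLeaves = leavesOfArray [| "a"; "b"; "c" |]
```

For reference types the chunked arrays become the Node arrays directly; for value types one boxed copy per chunk remains, since the elements have to be boxed somewhere.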
TheInnerLight
  • "Work just fine" depends on what the callee does. E.g. normally calling `arr.[0] <- obj()` would be fine given `arr:obj[]` of length at least one, but that will throw an exception with the boxed and unboxed array. – kvb Apr 06 '17 at 13:59
  • @kvb - In this case, the other parts of the code are doing things like `arr.[0] <- box item` where `item` is a value of type `'T`. Whether `'T` is a value or reference type, this will work. (I checked.) – rmunn Apr 06 '17 at 14:17
  • I just verified that this answer is indeed correct, and solves my problem. Thanks! – rmunn Apr 06 '17 at 14:17
  • @kvb That's true but is a broader problem with array covariance in .NET. Perhaps I should call that out specifically though. – TheInnerLight Apr 06 '17 at 14:36