This is a follow-up to my earlier question about the Persistence class that I was trying to expose as an enumerator. I realized that I really need to pass by reference in order to change the value of the object that I am trying to populate. I guess I am going about this in a C++ way (as most may have guessed, I am an F# beginner). However, I want to be as efficient in terms of memory footprint as I can. Ideally, I would like to reuse the same object over and over again when I read from a file.

I am having a problem with this code: it does not allow me to pass by reference in the call to the function serializer. I am reproducing the code here again. Thank you in advance for your help.

The error I get:

error FS0001: This expression was expected to have type byref<'T> but here has type 'T

If I change the call to serialize(& current_, reader_) I get the following error:

persistence.fs(71,6): error FS0437: A type would store a byref typed value. This is not permitted by Common IL.
persistence.fs(100,29): error FS0412: A type instantiation involves a byref type. This is not permitted by the rules of Common IL.
persistence.fs(100,30): error FS0423: The address of the field current_ cannot be used at this point

The CODE:

type BinaryPersistenceIn<'T when 'T: (new : unit -> 'T)>(fn: string, serializer: ('T byref * BinaryReader) -> unit) =
    let stream_ = File.Open(fn, FileMode.Open, FileAccess.Read)
    let reader_ = new BinaryReader(stream_)
    let mutable current_ = new 'T()

    let eof() =
        stream_.Position = stream_.Length

    interface IEnumerator<'T> with

        member this.Current
            with get() = current_

        member this.Dispose() =
            stream_.Close()
            reader_.Close()

    interface System.Collections.IEnumerator with

        member this.Current
            with get() = current_ :> obj

        member this.Reset() =
            stream_.Seek((int64) 0., SeekOrigin.Begin) |> ignore

        member this.MoveNext() =
            let mutable ret = eof()
            if stream_.CanRead && ret then
                serializer( current_, reader_)

            ret
ildjarn
Ramesh Kadambi

2 Answers

You can circumvent this by introducing a mutable local, passing it to serializer by reference, and then assigning the result back to current_:

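 member this.MoveNext() =
    // Read only while data remains, i.e. while the end of the file has not been reached.
    let ret = stream_.CanRead && not (eof())
    if ret then
        // The address of a mutable local can be taken, unlike that of the field current_.
        let mutable deserialized = Unchecked.defaultof<_>
        serializer( &deserialized, reader_)
        current_ <- deserialized

    ret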

But now this is becoming really, really unsettling. Notice the use of Unchecked.defaultof<_>? There is no other way to initialize a value of unknown type, and it's called "unchecked" for a reason: the compiler can't guarantee safety of this code.
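
To make the concern concrete, here is a minimal sketch of what Unchecked.defaultof produces. The Person record is a hypothetical type introduced only for this illustration.

```fsharp
// Person is a hypothetical record type used only for illustration.
type Person = { Name: string }

// For a value type the result is the zero value.
let defaultInt : int = Unchecked.defaultof<int>

// For a reference type the result is a null reference, even though
// F# records do not accept the null literal in ordinary code.
let defaultPerson : Person = Unchecked.defaultof<Person>

printfn "%d" defaultInt                      // prints 0
printfn "%b" (isNull (box defaultPerson))    // prints true
```

Reading defaultPerson.Name at this point would fail at run time with a NullReferenceException, which is the kind of error the compiler can no longer rule out.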

I strongly advise that you explore other ways of achieving your initial goal, such as using a seq computation expression instead, as I have suggested in your other question.
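
As a rough sketch of the seq-based alternative (not necessarily the exact code from the other question), the reader could look like the following. The names readAll, writeSample, and deserialize are hypothetical, and deserialize is assumed to read exactly one record from the BinaryReader.

```fsharp
open System.IO

// Hypothetical helper: lazily yields one deserialized value per iteration.
let readAll (fileName: string) (deserialize: BinaryReader -> 'T) : seq<'T> =
    seq {
        // `use` disposes the stream and the reader when enumeration finishes.
        use stream = File.Open(fileName, FileMode.Open, FileAccess.Read)
        use reader = new BinaryReader(stream)
        while stream.Position < stream.Length do
            yield deserialize reader
    }

// Hypothetical helper that writes three integers so there is something to read.
let writeSample (fileName: string) =
    use writer = new BinaryWriter(File.Open(fileName, FileMode.Create))
    for i in 1 .. 3 do
        writer.Write(i)

let path = Path.GetTempFileName()
writeSample path
let values = readAll path (fun r -> r.ReadInt32()) |> Seq.toList
printfn "%A" values   // prints [1; 2; 3]
File.Delete path
```

No byref and no Unchecked.defaultof are needed here, because each element is simply the return value of deserialize.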

Fyodor Soikin
  • I like your sequence expression. But I have noticed that my C# reader, which allocates a struct on each deserialization, slows down over time. – Ramesh Kadambi May 08 '15 at 20:28
  • I assume that by struct you are referring to a `'T` (which can be a struct). I find that a bit strange, as you are implicitly creating a new struct when calling Current, because a struct in .NET has value semantics. That means Current will create a copy. It would make more sense to me that it slowed down over time if you were allocating more and more "classes", as they are "heap" objects and thus need GC:ing. – Just another metaprogrammer May 08 '15 at 22:21
  • The main problem with `Unchecked.defaultof<>` is that it will return `null` values for F# reference types that don't allow null literals. It's not as "bad" as `FormatterServices.GetUninitializedObject`, which returns a non-constructed object. – Just another metaprogrammer May 08 '15 at 22:24

With respect to memory footprint, let's analyze the sequence option:

  • You have an instance of the seq. That's going to be some class implementing IEnumerable<'T>. This one will be held until you no longer need the seq, i.e. not reallocated each time.
  • You hold a Stream as part of the seq, with the same lifetime.
  • You hold a BinaryReader as part of the seq, with the same lifetime.
  • eof : unit -> bool is a compiler-generated function class as part of the seq, with the same lifetime.
  • The loop uses a bool for the while condition and another for the if condition. Both are stack-allocated values that are needed for the branching logic.
  • And finally, you yield an instance that you already got from the serializer.

Conceptually, that's as little memory consumption as you can have for a lazily evaluated seq. Once an element is consumed, it can be garbage collected. Multiple evaluations will do the same thing again.
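
What "multiple evaluations will do the same thing again" means can be sketched with a counter. This is only an illustration: the evaluations cell is a hypothetical stand-in for opening the file, and the yielded numbers stand in for deserialized records.

```fsharp
// `evaluations` counts how often the body of the sequence starts running.
let evaluations = ref 0

let numbers =
    seq {
        // Stands in for opening the stream and creating the reader.
        evaluations.Value <- evaluations.Value + 1
        yield 1
        yield 2
    }

let firstSum = Seq.sum numbers    // first enumeration runs the body
let secondSum = Seq.sum numbers   // second enumeration runs the body again

printfn "%d" evaluations.Value    // prints 2
```

Defining numbers does not run the body at all. Each enumeration runs it from the top, so any resource acquired inside the seq block is acquired once per enumeration.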

The only thing you can actually play with is what the serializer returns.

If you have your serializer return a struct, it is copied and stack-allocated. And it should not be mutable: mutable structs are discouraged (see "Why are mutable structs “evil”?").

Structs are good with respect to the garbage collector, as they avoid heap allocation and therefore garbage collection. But they are typically meant for very small objects, on the order of, say, 16-24 bytes at most.

Classes are heap-allocated and are passed by reference always. So if your serializer returns a class, say a string, then you just pass that around by reference and overhead of copying will be very small as you only ever copy the reference, not the content.
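
The difference between copying a value and copying a reference can be sketched as follows. PointStruct and PointClass are hypothetical types that exist only to contrast the two behaviours.

```fsharp
// A mutable struct: assignment copies the whole value.
[<Struct>]
type PointStruct =
    val mutable X: int
    new(x: int) = { X = x }

// A class: assignment copies only the reference.
type PointClass(x: int) =
    member val X = x with get, set

let originalStruct = PointStruct(1)
let mutable copiedStruct = originalStruct   // independent copy
copiedStruct.X <- 2                         // changes the copy only

let originalClass = PointClass(1)
let aliasedClass = originalClass            // same object, second name
aliasedClass.X <- 2                         // visible through both names

printfn "%d %d" originalStruct.X copiedStruct.X   // prints 1 2
printfn "%d %d" originalClass.X aliasedClass.X    // prints 2 2
```

This is also the behaviour that makes mutable structs surprising: mutating a struct only ever affects the copy at hand.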

If you want your serializer to be side-effecting, i.e. to overwrite the same object (a class, i.e. a reference type, is to be used for this), then the whole approach of IEnumerable<'T>, and consequently seq, is wrong. An IEnumerable should always give you new objects as results and should never modify any pre-existing object. The only state it should hold is the current position in the enumeration.

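So if you need a side-effecting version, you could do something like the following sketch, in which Target is a hypothetical class:

// Target is a hypothetical class with two mutable properties.
type Target() =
    member val MyProp1 = 0 with get, set
    member val MyProp2 = 0.0 with get, set

let readAndOverwrite (reader: BinaryReader) =
    // The reader tracks the position in the stream, so no extra state is needed.
    fun (target: Target) ->
        target.MyProp1 <- reader.ReadInt32()
        target.MyProp2 <- reader.ReadDouble()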

Passing as byref does not seem very reasonable to me, as you then allocate and garbage collect the object anyway. So you can just as well do that in an immutable way. What you can do instead is modify the properties of your existing object.
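
A minimal, self-contained sketch of reusing a single object in this way might look like the following. The Tick class, its field layout, and the readInto function are assumptions made for the example, and an in-memory stream stands in for the file.

```fsharp
open System.IO

// Tick is a hypothetical mutable class; the field layout is an assumption.
type Tick() =
    member val Id = 0 with get, set
    member val Price = 0.0 with get, set

// Overwrites the properties of an existing object instead of allocating a new one.
let readInto (reader: BinaryReader) (target: Tick) =
    target.Id <- reader.ReadInt32()
    target.Price <- reader.ReadDouble()

// Serialize two records into memory so there is something to read back.
let buffer = new MemoryStream()
let writer = new BinaryWriter(buffer)
writer.Write(1)
writer.Write(1.5)
writer.Write(2)
writer.Write(2.5)
writer.Flush()
buffer.Position <- 0L

let reader = new BinaryReader(buffer)
let current = Tick()                      // the single instance that is reused
let seen = ResizeArray<int * float>()
while buffer.Position < buffer.Length do
    readInto reader current               // no new Tick is allocated here
    seen.Add((current.Id, current.Price)) // copy out the values before the next read

let results = List.ofSeq seen
printfn "%A" results   // prints [(1, 1.5); (2, 2.5)]
```

The caller has to consume or copy the values before the next read, because every read overwrites the previous contents of the shared object.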

Daniel Fabian
  • I think you are mistaken about lifetime of the stream. It will be created anew for every new enumeration. Which exactly matches the OP's class-based implementation. You can think of the `seq` itself as corresponding to the class (with bound constructor arguments) and of enumerators produced from it as corresponding to instances of the class. – Fyodor Soikin May 10 '15 at 16:09
  • @Daniel Fabian, thank you for your comments and elaboration. I will look into your proposal of readAndOverwrite. I like it, and it seems really elegant. So you suggest that I return a function. Given that I am an F# and functional programming novice, it will be a while before I digest your solution and implement it. – Ramesh Kadambi May 11 '15 at 20:45
  • @Daniel Fabian, I read the article "Why are mutable structs evil?" and Mr. Lippert's article as well. I actually think his conclusion is wrong. Instead of advocating care when using structs, branding them as evil is not the right approach. I do not understand the notion of a non-mutable struct. What is that? The difference between a struct and a class/type, in my mind (forgive my C++ perspective), is that one has encapsulation and the other does not. The boundary should not be breached. – Ramesh Kadambi May 12 '15 at 18:11
  • A struct in .NET is a value type, whereas a class is a reference type. That is different from C++. A mutable struct would be a value type, i.e. one with copy semantics, that allows mutation. Think of, say, a point with x and y components where you can modify a coordinate. An immutable point, on the other hand, would mean that you create a new struct with the changed values instead. – Daniel Fabian May 12 '15 at 19:48
  • @Daniel, I lost you when you said "A mutable struct would be a value type". Are value types in .NET immutable (immutability enforced by the language)? Not really. If I understand it correctly, being a value type implies one important thing: values are copied on assignment. I do not understand where immutability enters the picture. – Ramesh Kadambi May 12 '15 at 22:03
  • Well, the whole point of not mutating structs is that mutation does not behave in an intuitive way. When you modify an object, you'd expect every reference to it to see the modification, which is what happens with classes. Mutating a struct, however, only changes your local copy. That's why it is suggested not to make mutable structs. Typical structs like DateTime, int, long, float, etc. are immutable, i.e. the struct is written in such a way that, once created, its value cannot be modified. – Daniel Fabian May 12 '15 at 22:27