Let's say I want an immutable model, a world. How should one model references?

case class World(people: Set[Person])

case class Person(name: String, loves: Option[Person])

val alice = Person("Alice", None)
val peter = Person("Peter", Some(alice))
val myWorld = World(Set(alice, peter))

println(myWorld)

Outputs:

World(Set(Person(Alice,None), Person(Peter,Some(Person(Alice,None)))))

But now we have two separate persons named Alice (one in the people set and one inside the peter person).

What are the best practices for approaching this kind of referencing in an immutable model in Scala?

I thought about referencing strictly through ids, but it doesn't feel right. Is there a better way? (Also, the current implementation doesn't support cycles, such as A loves B and B loves A.)

monnef

3 Answers


Although alice is printed twice, it exists only as one and the same value in your example. In general, if you want to trace the mutation of an otherwise immutable object, you would introduce an id field that carries a unique identifier. But here, clearly, you only have one value.

For recursive references, see this question. For example, you could use a by-name parameter.
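The by-name idea can be sketched roughly as follows. This is a minimal sketch, not the code from the linked question: it assumes a plain class instead of a case class (case class parameters cannot be by-name), and the names `lovesArg` and `Cycle` are made up for illustration.

```scala
// Plain class: a case class parameter cannot be declared by-name.
class Person(val name: String, lovesArg: => Option[Person]) {
  // Evaluated on first access, by which time both persons can be constructed.
  lazy val loves: Option[Person] = lovesArg

  // Print only the name; printing `loves` would recurse forever in a cycle.
  override def toString: String = s"Person($name)"
}

object Cycle {
  // Lazy vals allow the forward reference from alice to bob.
  lazy val alice: Person = new Person("Alice", Some(bob))
  lazy val bob: Person = new Person("Bob", Some(alice))

  def main(args: Array[String]): Unit = {
    // Following the cycle twice leads back to the very same instance.
    assert(alice.loves.get.loves.get eq alice)
    println(s"$alice loves ${alice.loves.get}")
  }
}
```

Note that this gives up the generated `equals`, `hashCode` and `copy` of a case class, so it only addresses the cycle, not the update problem.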

0__
  • Yes, you are right. My point is that I can now change one alice without the change being carried over to all alices, which is wrong, because it is supposed to be the same person. – monnef Jan 31 '16 at 05:41
  • Or give up the immutable model for another. [STM](https://nbronson.github.io/scala-stm/) is very useful. – 0__ Jan 31 '16 at 09:51

I think you have to distinguish between pure values, and things that have a notion of identity that survives state changes.

A person might be something in the latter category, depending on the requirements of your model. E.g. if the age of a person changes, it is still the same person.

For identifying entities over state changes, I don't think there is anything wrong with using some kind of unique identifier. Depending on your model, it might be a good idea to have a map from person id to person state at top level in your model, and then express relationships between persons either in the person state, or in a separate data structure.

Something like this:

case class Person(name: String, age: Int, loves: Set[PersonRef])

case class PersonRef(id: Long) // typesafe identifier for a person

case class World(persons: Map[PersonRef, Person])

Note that a person state does not contain the ID, since two persons with different IDs could have the same state.
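To illustrate how such a model might be used, here is a small sketch. The helpers `update` and `lovesOf` are hypothetical additions, not part of the answer's model; they assume that an update replaces the state stored under one id and that relationships are resolved through the top-level map.

```scala
case class PersonRef(id: Long) // typesafe identifier for a person

case class Person(name: String, age: Int, loves: Set[PersonRef])

case class World(persons: Map[PersonRef, Person]) {
  // Hypothetical helper: replace the state stored under `ref`, if present.
  def update(ref: PersonRef)(f: Person => Person): World =
    persons.get(ref) match {
      case Some(p) => copy(persons = persons.updated(ref, f(p)))
      case None    => this
    }

  // Hypothetical helper: current states of everyone `ref` loves.
  def lovesOf(ref: PersonRef): Set[Person] =
    persons.get(ref) match {
      case Some(p) => p.loves.flatMap(r => persons.get(r))
      case None    => Set.empty
    }
}

object WorldDemo {
  val aliceRef = PersonRef(1)
  val peterRef = PersonRef(2)

  val world = World(Map(
    aliceRef -> Person("Alice", 30, Set.empty),
    peterRef -> Person("Peter", 31, Set(aliceRef))
  ))

  // A new world in which Alice is one year older.
  val older: World = world.update(aliceRef)(p => p.copy(age = p.age + 1))

  def main(args: Array[String]): Unit = {
    // Peter's reference resolves to the updated Alice in the new world...
    assert(older.lovesOf(peterRef) == Set(Person("Alice", 31, Set.empty)))
    // ...while the old world is untouched.
    assert(world.lovesOf(peterRef) == Set(Person("Alice", 30, Set.empty)))
  }
}
```

Because Peter stores only a `PersonRef`, there is exactly one place where Alice's state lives, so a single update is visible through every reference.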

A problem with this approach is that the world could be in an inconsistent state. E.g. somebody could love a person that does not exist in the world. But I don't really see a way around this.
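The inconsistency cannot be ruled out by the types, but it can at least be detected. The following is a sketch under that assumption; `danglingRefs` and `isConsistent` are hypothetical names introduced here, not part of the model above.

```scala
case class PersonRef(id: Long)

case class Person(name: String, age: Int, loves: Set[PersonRef])

case class World(persons: Map[PersonRef, Person]) {
  // Hypothetical helper: references that point at no person in this world.
  def danglingRefs: Set[PersonRef] = {
    val referenced: Set[PersonRef] = persons.values.flatMap(_.loves).toSet
    referenced -- persons.keySet
  }

  // Hypothetical helper: true when every reference can be resolved.
  def isConsistent: Boolean = danglingRefs.isEmpty
}

object ConsistencyDemo {
  def main(args: Array[String]): Unit = {
    val peterRef = PersonRef(2)
    // Peter loves person 99, who does not exist in this world.
    val broken = World(Map(peterRef -> Person("Peter", 31, Set(PersonRef(99)))))
    assert(broken.danglingRefs == Set(PersonRef(99)))
    assert(!broken.isConsistent)
  }
}
```

Such a check could run whenever a new world is built, for example in a factory method that returns an error instead of an inconsistent world.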

I think it might be worth looking at Scala libraries that face a similar problem, e.g.:

Diode has the concept of a reference to a value elsewhere in the model.

Scala Graph lets you define custom node and edge types.

Rüdiger Klaehn
  • Thank you for your answer. Now I realized that actors in Akka are referenced in a very similar way (`ActorRef`). – monnef Jan 31 '16 at 05:43

When modelling an application domain with immutable data structures, you should not rely on object identity for anything. Just think about how you would update an immutable model: you would generate a modified copy, which has a different identity even though it represents the same "thing". How would you ensure that every reference in your model now points to the new, modified copy?
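The problem can be reproduced with the `Person` class from the question. This is only an illustration; the new name "Alicia" is made up.

```scala
case class Person(name: String, loves: Option[Person])

object StaleCopy {
  def main(args: Array[String]): Unit = {
    val alice = Person("Alice", None)
    val peter = Person("Peter", Some(alice))

    // "Updating" alice produces a new value; nothing else changes.
    val renamed = alice.copy(name = "Alicia")
    assert(renamed.name == "Alicia")

    // peter still embeds the old state of alice.
    assert(peter.loves.map(_.name) == Some("Alice"))
  }
}
```

To keep the model correct, every value that embeds alice would have to be rebuilt as well, which is what an explicit id avoids.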

Thus, you have to make identity explicit: ask yourself what the identity of something is, e.g. a unique id or a set of unique attributes (for a person: name, date and place of birth, and so on). Be careful, though: while the real date of birth of a person never changes, the one stored in your data model might, because of an error in your data set.

Then use this information to point to an object from everywhere else. Making identity explicit requires you to think about it when you build your data model. This might feel like an extra burden, but in fact it avoids a lot of trouble later on. For example, serialization will be easy, distribution will be easy, adding some kind of versioning support will be easy, and so on.

As a rule, you should share a reference to the same information only if that information is the same merely by coincidence. If you are pointing to the same "identity" and the information in both places has to be the same, use an explicit id.

dth