Identity of simulation objects in Haskell

Question

When you write a simulation in an object-oriented language, each object has an identity: a way to distinguish it from every other object in the simulation, even if other objects have exactly the same attributes. An object retains its identity no matter how much it changes over time. This is because each object has a unique location in memory, and we can express that location with pointers or references. This works even if you don't impose an additional identity system like GUIDs (which you would often do to support things like networking or databases, which don't think in terms of pointers).

I don't believe there is an equivalent concept in Haskell. So, would the standard approach be to use something like GUIDs?

Update to clarify the problem: Identity is an important concept in my problem domain for one reason: Objects have relationships to each other, and these must be preserved. For example, Haskell would normally say a red car is a red car, and all red cars are identical (provided color is the only attribute cars have). But what if each red car must be linked to its owner? And what if the owner can repaint his cars?

Final update synthesizing the answers: The consensus seems to be that you should only add identifiers to data types if some part of the simulation will actually use those identifiers, and there's no other way to express the same information. E.g. for the case where a Person owns multiple Cars, each of which has a color, a Person can keep a list of immutable Cars. That fully expresses the ownership relationship as long as you have access to the Person.

The Person may or may not need some kind of unique identifier. One scenario where this would occur is: There's a function that takes a Car and a collection of all Persons and imposes a ParkingTicket on the appropriate Person. The Car's color cannot uniquely identify the Person who gets the ticket. So we can give the Person an ID and have the Car store it.
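A minimal sketch of that scenario, under some assumptions of mine: the ID is a plain Int wrapped in a newtype, and a ParkingTicket is modelled as nothing more than a counter on the Person. The names PersonId, carOwner, ticketCount and ticketOwner are made up for illustration.

```haskell
-- Hypothetical identifier type; any type with an Eq instance would do.
newtype PersonId = PersonId Int deriving (Eq, Show)

data Color = Red | Blue deriving (Eq, Show)

-- The Car stores the ID of its owner, not the owner itself.
data Car = Car
    { carColor :: Color
    , carOwner :: PersonId
    } deriving (Eq, Show)

-- Assumption: a parking ticket is just a count on the Person.
data Person = Person
    { personId    :: PersonId
    , personName  :: String
    , ticketCount :: Int
    } deriving (Eq, Show)

-- Give one ticket to whoever owns the car; everyone else is unchanged.
ticketOwner :: Car -> [Person] -> [Person]
ticketOwner car = map bump
  where
    bump p
        | personId p == carOwner car = p { ticketCount = ticketCount p + 1 }
        | otherwise                  = p
```

The car's color plays no part in the lookup, so repainting a car cannot break the link to its owner.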

But even this could potentially be avoided with a better design. Perhaps our Cars now have an additional attribute of type ParkingPosition, which can be evaluated as legal or illegal. So we pass the collection of Persons to a function that looks at each Person's list of Cars, checks each one's ParkingPosition, and imposes the ParkingTicket on that Person if appropriate. Since this function always knows which Person it's looking at, there's no need for the Car to record that info.
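Roughly, the ID-free design could look like the sketch below. It assumes ParkingPosition is a simple two-constructor type (the real one would presumably be evaluated against some rules) and again models the ticket as a counter; ticketPerson and ticketAll are hypothetical names.

```haskell
data Color = Red | Blue deriving (Eq, Show)

-- Assumption: legality is already known; a real model might compute it.
data ParkingPosition = Legal | Illegal deriving (Eq, Show)

data Car = Car
    { carColor    :: Color
    , carPosition :: ParkingPosition
    } deriving (Eq, Show)

-- The Person owns the Cars directly, so no Car needs an owner ID.
data Person = Person
    { personName  :: String
    , personCars  :: [Car]
    , ticketCount :: Int
    } deriving (Eq, Show)

-- One ticket per illegally parked car belonging to this Person.
ticketPerson :: Person -> Person
ticketPerson p = p { ticketCount = ticketCount p + illegal }
  where
    illegal = length (filter ((== Illegal) . carPosition) (personCars p))

-- Walk the whole collection; each step knows which Person it is looking at.
ticketAll :: [Person] -> [Person]
ticketAll = map ticketPerson
```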

So in many cases, assigning IDs is not as necessary as it first may seem.

Can you go into more detail about what you are trying to achieve? Object identity is obviously useful when your objects are mutable, but why would you want to distinguish between identical values in Haskell? — fjh, Oct 07 '13 at 16:22
Because sometimes those objects have relationships (like ownership) with other objects, and those relationships must be preserved. E.g. there might be two projectiles "owned" by two different launchers. Those projectiles need some way to keep track of who owns them. They can't just have a pointer to their owner, thus the need for a different strategy. — rlkw1024, Oct 07 '13 at 17:26
@Jarret Why can you not differentiate them by their launcher? What more do you need? — itsbruce, Oct 07 '13 at 17:58

score 3 · Answer 1 · answered Oct 07 '13 at 16:24

My approach would be to store all state information in a data record, like

data ObjState = ObjState
    { objStName :: String
    , objStPos :: (Int, Int)
    , objStSize :: (Int, Int)
    } deriving (Eq, Show)

data Obj = Obj
    { objId :: Int
    , objState :: ObjState
    } deriving (Show)

instance Eq Obj where
    obj1 == obj2 = objId obj1 == objId obj2

Here equality compares only objId, so an Obj keeps its identity however its state changes. The state itself should be managed by the API, library, or application. If you need true pointers to mutable structures, there are built-in libraries for that, but they're considered unsafe and dangerous to use unless you know what you're doing (and even then, you have to be cautious). Check out the Foreign modules in base for more information.
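To illustrate how the records above behave, here is a small self-contained sketch. The types are repeated from the answer; moveTo is a hypothetical helper I added.

```haskell
data ObjState = ObjState
    { objStName :: String
    , objStPos  :: (Int, Int)
    , objStSize :: (Int, Int)
    } deriving (Eq, Show)

data Obj = Obj
    { objId    :: Int
    , objState :: ObjState
    } deriving (Show)

-- Two Objs are "the same object" exactly when their IDs match.
instance Eq Obj where
    obj1 == obj2 = objId obj1 == objId obj2

-- Hypothetical helper: returns a new Obj at a new position.
-- The ID is carried over unchanged, so identity survives the update.
moveTo :: (Int, Int) -> Obj -> Obj
moveTo pos obj = obj { objState = (objState obj) { objStPos = pos } }
```

With these definitions, moveTo (5, 5) obj == obj holds even though the two states differ, while two Objs with identical state but different IDs compare as unequal.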

itsbruce · Accepted Answer · 2013-10-07T18:29:56.483

Why do you want to "solve" this non-problem? Object identity is a problem with OO languages which Haskell happily avoids.

In a world of immutable objects, two objects with identical values are the same object. Put the same immutable object twice into a list and you have two different objects wherever you want to see things that way (they "both" contribute to the total number of elements, for example, and they have unique indexes) without any of the problems that Java-style reference equality causes. You can even save that list to a database table and get two different rows, if you like. What more do you need?

UPDATE

Jarret, you seem to be convinced that the objects in your model must have genuinely separate identities just because real life ones would be distinct. However, in a simulation, this only matters in the contexts where the objects need to be differentiated. Generally, you only need unique identifiers in a context where objects must be differentiated and tracked, and not outside those contexts. So identify those contexts, map the lifecycle of an object that is important to your simulation (not to the "real" world), and create the appropriate identifiers.

You keep providing answers to your own questions. If cars have owners, then Bob's red car can be distinguished from Carol's red car. If Bob can repaint his car, you can replace his red car with a blue car. You only need more if:

1. Your simulation has cars without owners
2. You need to be able to distinguish between one ownerless red car and another.

In a simple model, 1 may be true and 2 not. In which case, all ownerless red cars are the same red car so why bother making them distinct?
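As a sketch of the owned-car case: key the cars by owner and repaint by replacing the value. This assumes one car per owner and owner names as plain Strings, both simplifications of mine; Registry and repaint are made-up names. Data.Map comes from the containers package that ships with GHC.

```haskell
import qualified Data.Map.Strict as Map

data Color = Red | Blue deriving (Eq, Show)

-- A car with color as its only attribute, as in the question.
newtype Car = Car { carColor :: Color } deriving (Eq, Show)

-- Assumption: owners are identified by name, one car each.
type Owner = String

type Registry = Map.Map Owner Car

-- "Repainting" replaces the owner's car with one of the new color.
-- Owners not present in the registry are left untouched.
repaint :: Owner -> Color -> Registry -> Registry
repaint owner c = Map.adjust (\car -> car { carColor = c }) owner
```

Bob's and Carol's cars are equal as values, yet they never get confused, because the key already says whose car each one is.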

In your missile simulation, why do missiles need to track their owning launchers? They're not aimed at their launchers! If the launcher can continue to control the missile after it is launched, then the launcher needs to know which missiles it owns but the reverse is not true. The missile just needs to know its trajectory and target. When it lands and explodes, what is the significance of the owner? Will it make a bigger bang if it was launched from launcher A rather than launcher B?

Your launcher can be empty or it can have n missiles still available to fire. It has a location. Targets have locations. At any one time there are k missiles in flight; each missile has a position, a velocity/trajectory and an explosive power. Any missile whose position is coincident with the ground should be transformed into an exploding missile, or a dud, and so on.

In each of those contexts, which information is important? Is the launcher identity really important at detonation time? Why? Is the enemy going to launch a retaliatory strike? No? Then that's not important information for the detonation. It probably isn't even important information after launch. Launching can simply be a step where the number of missiles belonging to Launcher A is decremented while the number of missiles in flight is incremented.
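Something like the following sketch captures that launch step. It assumes a single launcher and 2D coordinates as pairs of Doubles; World, launch and the field names are hypothetical.

```haskell
data Launcher = Launcher
    { launcherPos  :: (Double, Double)
    , missilesLeft :: Int
    } deriving (Eq, Show)

-- A missile in flight knows nothing about the launcher that fired it.
data Missile = Missile
    { missilePos   :: (Double, Double)
    , missileVel   :: (Double, Double)
    , missilePower :: Double
    } deriving (Eq, Show)

-- Assumption: one launcher per world, to keep the sketch short.
data World = World
    { launcher :: Launcher
    , inFlight :: [Missile]
    } deriving (Eq, Show)

-- Launching decrements the launcher's stock and adds one missile in flight.
-- An empty launcher leaves the world unchanged.
launch :: (Double, Double) -> Double -> World -> World
launch vel power w
    | missilesLeft l <= 0 = w
    | otherwise = w
        { launcher = l { missilesLeft = missilesLeft l - 1 }
        , inFlight = Missile (launcherPos l) vel power : inFlight w
        }
  where
    l = launcher w
```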

Now, you might have a good answer to these questions, but you should fully map your model before you start lumbering objects with identities they may not need.

"In the world of immutable objects, two objects with identical values are the same object"--But in many systems we'd like to model, that's not true. Objects may have the same attributes, yet be *related* to different objects. Like, you could have two brown cows, but their owners are different. Those brown cows are not the same object, and can't be treated as such. Thus the need for some way to uniquely identify the related object. — rlkw1024, Oct 07 '13 at 17:28
Great! You have the owners' names, so put them in a map, with the owner as key. I am not joking; if, in your model, the only way to differentiate cows is by their owner, that is all you need. **First** identify the various contexts in which these cows need to be differentiated, *then* choose the appropriate solution. Don't start with a broken model. — itsbruce, Oct 07 '13 at 17:41
Thanks! Suppose the cows' owners can change their names. In this case, would you still recommend against using an arbitrary numerical identifier to link the objects? I suppose if you still keyed it to the name, you'd have to build a new map whenever an owner's name changed. — rlkw1024, Oct 07 '13 at 17:56
Does your simulation *really* care whether a cow is the same cow? If every cow has an owner, every owner feeds his or her own cows, if a change of ownership means one owner now has two cows and another now has one, *does it matter*? If you really need to track the life history of each cow, then probably yes but then give the cows unique identifiers. However, most objects in your simulation do not need that. If the cows are fed bales of hay, would you insist on each bale being a distinct object? — itsbruce, Oct 07 '13 at 18:04
@Jarrett Or you could just add an "ID" field to your cow type, and generate unique IDs for each cow. Even in OOP, if you had a class for cow with the attributes name, color, and weight, you wouldn't be able to tell two identical cows apart without a unique ID. This isn't a problem in OOP or FP, it's just a problem in programming. — bheklilr, Oct 07 '13 at 18:05

score 1 · Answer 3 · answered Oct 07 '13 at 17:05

In Haskell the concepts of values and identities are decoupled. All variables are simply immutable bindings to values.

There are a few types whose value is a mutable reference to another value, such as IORef, MVar and TVar; these can be used as identities.

You can perform identity checks by comparing two MVars and an equality check by comparing their referenced values.
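A small sketch of the two kinds of check, using IORef rather than MVar to keep it short (both have Eq instances that compare the references themselves). The helper names sameRef and sameContents are mine.

```haskell
import Data.IORef (IORef, newIORef, readIORef)

-- Identity check: Eq on IORef compares the references, not their contents.
sameRef :: IORef a -> IORef a -> Bool
sameRef = (==)

-- Equality check: read both references and compare the values they hold.
sameContents :: Eq a => IORef a -> IORef a -> IO Bool
sameContents r1 r2 = (==) <$> readIORef r1 <*> readIORef r2
```

Two references created by separate newIORef calls are distinct under sameRef even when sameContents reports that they hold equal values.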

An excellent talk by Rich Hickey goes into detail on the issue: http://www.infoq.com/presentations/Value-Values

wit · Answer 4 · 2013-10-07T18:46:13.523

You can always write:

> let mylist = take 25 $ cycle "a"
> mylist
"aaaaaaaaaaaaaaaaaaaaaaaaa"
> zip mylist [1..]
[('a',1),('a',2),('a',3),('a',4),('a',5),('a',6),('a',7),('a',8),('a',9),('a',10),
 ('a',11),('a',12),('a',13),('a',14),('a',15),('a',16),('a',17),('a',18),('a',19),
 ('a',20),('a',21),('a',22),('a',23),('a',24),('a',25)]

Joking aside, store the ID as part of the data:

data MyObj = MyObj {objId :: Int, ...}  -- objId rather than id, to avoid clashing with Prelude's id

UPDATED

If we want to work with colors and ids separately, we can do the following in Haskell:

data Color = Green | Red | Blue deriving (Eq, Show)

data Car = Car {carid :: Int, color :: Color} deriving (Show)

garage = [Car 1 Green, Car 2 Green, Car 3 Red]

redCars = filter ((== Red) . color) garage

greenCars = filter ((== Green) . color) garage

paint2Blue car = car {color=Blue}

isGreen = (== Green) . color

newGarage = map (\car -> if isGreen car then paint2Blue car else car) garage

And see the result in GHCi:

> garage
[Car {carid = 1, color = Green},Car {carid = 2, color = Green},Car {carid = 3, color = Red}]
> redCars
[Car {carid = 3, color = Red}]
> greenCars
[Car {carid = 1, color = Green},Car {carid = 2, color = Green}]
> newGarage
[Car {carid = 1, color = Blue},Car {carid = 2, color = Blue},Car {carid = 3, color = Red}]
