Current state
I have two data types.
    data Foo = Foo
      { fooId :: RecordId Foo
      , bars :: [RecordId Bar]
      ...
      }

    data Bar = Bar
      { barId :: RecordId Bar
      ...
      }
This schema allows each Foo to refer to an arbitrary list of Bars. A Bar can therefore be shared by any number of Foos, or referenced by none.
I already have data persisted in acid-state that uses this schema.
Desired state
    data Foo = Foo
      { fooId :: RecordId Foo
      ...
      }

    data Bar = Bar
      { barId :: RecordId Bar
      , fooId :: RecordId Foo
      ...
      }
In the desired state, each Bar must belong to exactly one Foo, as in a typical many-to-one foreign key relationship in SQL.
The Problem
Of course, there is no lossless way to transition between these two states, as the latter is less expressive than the former. However, I can write code that resolves the ambiguity: when a Bar is referenced by several Foos, prefer the Foo with the smallest fooId, and simply delete any Bar that is not referenced by any Foo.
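To make the tie-breaking rule concrete, here is a minimal sketch of it as a pure function. The `FooId` and `BarId` aliases and the name `chooseOwners` are hypothetical stand-ins for illustration (the real code would use `RecordId Foo` and `RecordId Bar`); only `Data.Map.Strict` from containers is assumed.

```haskell
import qualified Data.Map.Strict as Map

-- Hypothetical stand-ins for RecordId Foo and RecordId Bar.
type FooId = Int
type BarId = Int

-- Input: each Foo's id paired with the Bars it refers to (old schema).
-- Output: for every Bar referenced at least once, the smallest
-- referencing fooId. Bars that no Foo refers to never appear in the
-- result, which is how they end up being deleted.
chooseOwners :: [(FooId, [BarId])] -> Map.Map BarId FooId
chooseOwners refs =
  Map.fromListWith min [ (b, f) | (f, bs) <- refs, b <- bs ]
```

For example, `chooseOwners [(2, [10, 11]), (1, [10]), (3, [])]` maps Bar 10 to Foo 1 (the smaller of the two referencing ids) and Bar 11 to Foo 2.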
My issue is that I cannot see any path, using SafeCopy, to migrate between these two schemas. As far as I can tell, SafeCopy defines migrations as pure functions between types, and I cannot query the state of acid-state inside a migrate function. What I need here, though, is a migration that runs once, on the state at a specific point in time, and converts one schema into the other. With a database this would be trivial, but with acid-state I just can't see my way forward.
The only inkling of a solution that I have is a separate program (or, say, a command-line feature callable from the main program) compiled specifically to run the few lines of code necessary to handle the data migration (so, say, all Foov0 and Barv0 values are converted to Foov1 and Barv1), after which I would simply swap in the new schema in my main program.
However, I don't even see how this could work. In my understanding of SafeCopy, if I define migrations to the new schema in the normal way, then as soon as I try to access the data I will be given a value of the new data type, which of course no longer contains the information I need to perform the migration.
One option (a clumsy one, it seems to me) might be to define two further data types, copy the data across to them, then change the schema and run a migration that copies the data back into the new schema, and finally remove the extra data types. That requires three separately compiled versions of the program to be run on the data in sequence, which does not seem very elegant!
Any pointers would be greatly appreciated.
Edit: Possible Solution
I neglected to mention that the schema above is wrapped in a data type that represents the entire state of the program, like

    data DB = DB {
      dbFoos :: [Foo],
      dbBars :: [Bar]
    }

I think this means that all I need to do is define a new DB data type and write a migration from DBv0 to DB, handling my data there without any need for sequencing or monadic activity. I will experiment with this and post it as an answer if successful.
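A rough sketch of what such a whole-state migration could look like, under stated assumptions: `RecordId` is simplified to a phantom-typed `Int` wrapper, the elided fields are left out, and the names `FooV0`, `BarV0`, `migrateDB`, and `barFooId` are made up for illustration. The block uses only containers so that it stands alone. With safecopy, the function would presumably become the body of `migrate` in an `instance Migrate DB` with `type MigrateFrom DB = DBv0`, while the old types keep their existing SafeCopy instances so that previously stored data can still be read.

```haskell
import qualified Data.Map.Strict as Map

-- Simplified stand-in for the real RecordId type (an assumption).
newtype RecordId a = RecordId Int deriving (Eq, Ord, Show)

-- Old schema: each Foo lists the Bars it refers to.
data FooV0 = FooV0 { fooIdV0 :: RecordId Foo, barsV0 :: [RecordId Bar] }
newtype BarV0 = BarV0 { barIdV0 :: RecordId Bar }
data DBv0 = DBv0 { dbFoosV0 :: [FooV0], dbBarsV0 :: [BarV0] }

-- New schema: each Bar points at exactly one Foo.
newtype Foo = Foo { fooId :: RecordId Foo } deriving (Eq, Show)
data Bar = Bar { barId :: RecordId Bar, barFooId :: RecordId Foo }
  deriving (Eq, Show)
data DB = DB { dbFoos :: [Foo], dbBars :: [Bar] } deriving (Eq, Show)

-- Pure conversion of the whole state. Because it receives the entire
-- old DB value, it can look across all Foos and Bars at once.
migrateDB :: DBv0 -> DB
migrateDB (DBv0 foos bars) = DB newFoos newBars
  where
    newFoos = [ Foo (fooIdV0 f) | f <- foos ]
    -- For each referenced Bar, the smallest referencing fooId wins.
    owners = Map.fromListWith min
      [ (b, fooIdV0 f) | f <- foos, b <- barsV0 f ]
    -- Bars without an owner fail the pattern match and are dropped.
    newBars =
      [ Bar (barIdV0 b) owner
      | b <- bars
      , Just owner <- [Map.lookup (barIdV0 b) owners]
      ]
```

The key design point is that the migration is defined on the top-level state type rather than on Foo or Bar individually, so no access to acid-state is needed from inside the function.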