I have to implement a proprietary binary format and want to do this with scodec, but I cannot find a concise solution. The format is as follows: a file consists of multiple records, each prefixed with a little-endian 16-bit number "t" (uint16L). Records fall into four categories, depending on the values of the first and second byte of t:
- Normal: t.first != 0 && t.second == 0
- Bar: t.first == 0x08 && t.second == 0xFF
- Foo: t.first == 0x04 && t.second == 0x05
- Invalid: t is none of the above
If t is Invalid, the program should exit, because the file is corrupted. If t is either Normal or Bar, the length of the record follows as a 32-bit little-endian int. If t is Foo, another 16-bit big-endian int must be parsed before the length can be parsed as a 32-bit big-endian int.
- Normal: ("t" | uint16L) :: ("length" | uint32L) :: [Record data discriminated by t]
- Bar: ("t" | constant(0x08FF)) :: ("length" | uint32L) :: [Record data of Bar]
- Foo: ("t" | constant(0x0405)) :: uint16 :: ("length" | uint32) :: [Record data of foo]
- Invalid: ("t" | uint16L ) :: fail(Err(s"invalid type: $t"))
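To make the rules above concrete, here is a minimal sketch of the prefix layout in plain Scala on top of java.nio.ByteBuffer. It deliberately uses no scodec at all, and every name in it (Prefix, NormalPrefix, BarPrefix, FooPrefix, PrefixSketch, readPrefix) is hypothetical. It assumes that "first" means the first byte on the wire and that the lengths are unsigned.

```scala
import java.nio.{ByteBuffer, ByteOrder}

// Hypothetical types for illustration only; none of this is scodec API.
sealed trait Prefix
final case class NormalPrefix(t: Int, length: Long) extends Prefix
final case class BarPrefix(length: Long) extends Prefix
final case class FooPrefix(extra: Int, length: Long) extends Prefix

object PrefixSketch {
  /** Reads one record prefix at the buffer's current position.
    * Throws BufferUnderflowException if the buffer ends too early. */
  def readPrefix(buf: ByteBuffer): Either[String, Prefix] = {
    // Read the two bytes of t individually so "first" and "second" are wire order.
    val first  = buf.get() & 0xFF
    val second = buf.get() & 0xFF
    val t      = first | (second << 8) // the Int that uint16L would produce

    // Unsigned reads with an explicit byte order.
    def u32(order: ByteOrder): Long = { buf.order(order); buf.getInt() & 0xFFFFFFFFL }
    def u16(order: ByteOrder): Int  = { buf.order(order); buf.getShort() & 0xFFFF }

    if (first != 0 && second == 0)
      Right(NormalPrefix(t, u32(ByteOrder.LITTLE_ENDIAN)))
    else if (first == 0x08 && second == 0xFF)
      Right(BarPrefix(u32(ByteOrder.LITTLE_ENDIAN)))
    else if (first == 0x04 && second == 0x05) {
      // Foo carries an extra big-endian 16-bit value before a big-endian length.
      val extra = u16(ByteOrder.BIG_ENDIAN)
      Right(FooPrefix(extra, u32(ByteOrder.BIG_ENDIAN)))
    } else
      Left(f"invalid type: 0x$t%04X")
  }
}
```

For the bytes 04 05 00 01 00 00 00 10 wrapped in a ByteBuffer, readPrefix returns Right(FooPrefix(1, 16)); for 00 00 it returns a Left, which is the case where the program should stop.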
Furthermore, some values of t in the "Normal" category are unused and should produce an UnknownRecord (similar to the mpeg implementation here: https://github.com/scodec/scodec-protocols/blob/series/1.0.x/src/main/scala/scodec/protocols/mpeg/Descriptor.scala).
This is my current approach, but it does not feel clear, and I get the feeling that I'm working around scodec more than with it. Any ideas? Feel free to scrap my code below.
sealed trait ContainerType
object ContainerType {
  implicit class SplitInt(val self: Int) extends AnyVal {
    def first = self & 0xFF
    def second = (self >> 8) & 0xFF
  }
  case object Normal extends ContainerType
  case object Bar extends ContainerType
  case object Foo extends ContainerType
  case object Invalid extends ContainerType
  val codec: Codec[ContainerType] = {
    def to(value: Int): ContainerType = value match {
      case v if v.first != 0 && v.second == 0 => Normal
      case v if v.first == 0x08 && v.second == 0xFF => Bar
      case v if v.first == 0x04 && v.second == 0x05 => Foo
      case _ => Invalid
    }
    uint16L.xmap(to, ???) // don't have value here
    // if I use case classes and save the value I can't discriminate by it in RecordPrefix
  }
}
sealed trait RecordPrefix {
  def t: Int
  def length: Int
}
object RecordPrefix {
  case class Normal(override val t: Int, override val length: Int) extends RecordPrefix
  object Normal {
    val codec: Codec[Normal] = ???
  }
  case class Bar(override val t: Int, override val length: Int) extends RecordPrefix
  object Bar {
    val codec: Codec[Bar] = ???
  }
  case class Foo(override val t: Int, foobar: Int, length: Int) extends RecordPrefix
  object Foo {
    val codec: Codec[Foo] = ???
  }
  val codec: Codec[RecordPrefix] = {
    // Inside this object, Normal/Bar/Foo name the case class companions,
    // so the discriminator tags have to be qualified with ContainerType.
    discriminated[RecordPrefix].by(ContainerType.codec)
      .typecase(ContainerType.Normal, Normal.codec)
      .typecase(ContainerType.Bar, Bar.codec)
      .typecase(ContainerType.Foo, Foo.codec)
    // how to handle invalid case ?
  }
}
case class Record(prefix: RecordPrefix, body: RecordBody)
sealed trait RecordBody
//.... How can I implement the codecs?
PS: This is my first question here, I hope it was clear enough. =)
Edit1: I found an implementation that at least does the job. I accepted the trade-off of checking the conditions again when the record is unknown, in order to get a cleaner hierarchy.
trait KnownRecord
sealed trait NormalRecord extends KnownRecord
case class BarRecord(length: Int, ..,) extends KnownRecord
object BarRecord {
  val codec: Codec[BarRecord] = {
    ("Length" | int32L) ::
    //...
  }.as[BarRecord]
}
case class FooRecord(...) extends KnownRecord
object FooRecord {
  val codec: Codec[FooRecord] = // analogue
}
case class A() extends NormalRecord
case class B() extends NormalRecord
// ...
case class UnknownRecord(rtype: Int, length: Int, data: ByteVector)
object UnknownRecord {
  val codec: Codec[UnknownRecord] = {
    ("Type" | Record.validTypeCodec) ::
    (("Length" | int32L) >>:~ { length =>
      ("Data" | bytes(length - 6)).hlist
    })
  }.as[UnknownRecord]
}
object Record {
  type Record = Either[UnknownRecord, KnownRecord]
  val validTypeCodec: Codec[Int] = {
    uint16L.consume[Int] { rtype =>
      val first = rtype & 0xFF
      val second = (rtype >> 8) & 0xFF
      rtype match {
        case i if first != 0 && second == 0 => provide(i)
        case i if first == 0x04 && second == 0x05 => provide(i)
        case i if first == 0xFF && second == 0x08 => provide(i)
        case _ => fail(Err(s"Invalid Type: $rtype!"))
      }
    } (identity)
  }
  def normalCodec(rtype: Int): Codec[NormalRecord] = {
    discriminated[NormalRecord].by(provide(rtype))
      .typecase(1, A.codec)
      .typecase(2, B.codec)
      .typecase(3, C.codec)
      .typecase(4, D.codec)
      .framing(new CodecTransformation {
        def apply[X](c: Codec[X]) = variableSizeBytes(int32L, c.complete,
          sizePadding = 6)
      })
  }
  val knownCodec: Codec[KnownRecord] = {
    val b = discriminated[KnownRecord].by(("Type" | uint16L))
      .typecase(0x0504, FooRecord.codec)
      .typecase(0x08FF, BarRecord.codec)
    (1 to 0xFF).foldLeft(b) {
      (acc, x) => acc.typecase(x, normalCodec(x))
    }
  }
  implicit val codec: Codec[Record] = {
    discriminatorFallback(UnknownRecord.codec, knownCodec)
  }
}
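As a sanity check on the discriminator values used in knownCodec, here is a small plain-Scala sketch (no scodec involved) that splits the Int produced by uint16L back into its wire bytes. TypeValueSketch, split and allNormal are hypothetical names used only for this illustration.

```scala
object TypeValueSketch {
  // For a little-endian uint16, the low byte is the first byte on the wire.
  def split(t: Int): (Int, Int) = (t & 0xFF, (t >> 8) & 0xFF)

  // Every value registered by the fold over 1 to 0xFF satisfies the Normal rule.
  val allNormal: Boolean = (1 to 0xFF).forall { t =>
    val (first, second) = split(t)
    first != 0 && second == 0
  }
}
```

With this reading, split(0x0504) gives (0x04, 0x05), which is the Foo rule. split(0x08FF) gives (0xFF, 0x08), which agrees with the check in validTypeCodec but is the reverse of the Bar rule in the category list at the top (first == 0x08, second == 0xFF would be the value 0xFF08). Which of the two the format actually uses is not settled by this snippet.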
Edit2: I posted an alternative solution as an answer below.