I have a Spark application written in Scala with a Dataset[Event], where Event is a user-defined type, something like this:
case class Event(timestamp: Long, state: String, source: String)
which I am transforming to this:
case class TransformedEvent(timestamp: Long, state: String, source: String, is_finished: Boolean)
Basically, I am adding one field "is_finished" based on the other fields.
For example: is_finished = true if state = "state1" AND source = "source1", and so on.
For a better explanation, here is the code:
val events: Dataset[Event] = getEvents()
// Here is the transformation
val transformedEvents: Dataset[TransformedEvent] = events.map(e => convert(e))
// where the convert function is something like this
def convert(event: Event): TransformedEvent = {
  // the flag is derived from the other fields
  val isFinished = event.state == "state1" && event.source == "source1"
  TransformedEvent(
    timestamp = event.timestamp,
    state = event.state,
    source = event.source,
    is_finished = isFinished)
}
I am trying to figure out a way to make conditions like
event.state == "state1" && event.source == "source1"
config-driven, because I might have to add, delete, or update these rules in the future and do not want to change the code and redeploy every time that happens.
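To give an idea of what I have in mind (the names below are just placeholders I made up, and spark is the SparkSession): would it be a reasonable direction to keep each rule as a Spark SQL expression string in a config file and evaluate it at runtime, something like this?

import org.apache.spark.sql.functions.expr
import spark.implicits._

// hypothetical rule string, loaded from a config file/table instead of being hard-coded
val isFinishedRule: String = "state = 'state1' AND source = 'source1'"

val transformedEvents: Dataset[TransformedEvent] =
  events
    .withColumn("is_finished", expr(isFinishedRule)) // evaluate the configured condition
    .as[TransformedEvent]                            // back to a typed Dataset

Or is there a more standard way to structure such rules in configuration?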
Can anyone point me in the right direction?
Thanks in advance.