This is primarily a design problem; I am not sure how to achieve this in Akka.
User Story
- I need to parse big files (> 10 million lines) that look like this:
2013-05-09 11:09:01 Local4.Debug 172.2.10.111 %MMT-7-715036: Group = 199.19.248.164, IP = 199.19.248.164, Sending keep-alive of type DPD R-U-THERE (seq number 0x7db7a2f3)
2013-05-09 11:09:01 Local4.Debug 172.2.10.111 %MMT-7-715046: Group = 199.19.248.164, IP = 199.19.248.164, constructing blank hash payload
2013-05-09 11:09:01 Local4.Debug 172.2.10.111 %MMT-7-715046: Group = 199.19.248.164, IP = 199.19.248.164, constructing qm hash payload
2013-05-09 11:09:01 Local4.Debug 172.2.10.111 %ASA-7-713236: IP = 199.19.248.164, IKE_DECODE SENDING Message (msgid=61216d3e) with payloads : HDR + HASH (8) + NOTIFY (11) + NONE (0) total length : 84
2013-05-09 11:09:01 Local4.Debug 172.22.10.111 %MMT-7-713236: IP = 199.19.248.164, IKE_DECODE RECEIVED Message (msgid=867466fe) with payloads : HDR + HASH (8) + NOTIFY (11) + NONE (0) total length : 84
- For each line I need to generate an Event that will be sent to a server.
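The per-line parsing step can be prototyped with a plain regex before any actors are involved. The Event fields and the pattern below are assumptions inferred from the sample lines above, not a definitive format:

```scala
// Hypothetical Event shape inferred from the sample log lines above.
case class Event(timestamp: String, facility: String, host: String, msgId: String, body: String)

// Assumed layout: "<date> <time> <facility> <host> %<msg-id>: <rest>"
val LogLine = """(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}) (\S+) (\S+) %(\S+): (.*)""".r

def parse(line: String): Option[Event] = line match {
  case LogLine(ts, fac, host, id, body) => Some(Event(ts, fac, host, id, body))
  case _                                => None // malformed line: skip or count it
}

val sample = "2013-05-09 11:09:01 Local4.Debug 172.2.10.111 %MMT-7-715046: Group = 199.19.248.164, IP = 199.19.248.164, constructing blank hash payload"
val parsed = parse(sample)
```

Keeping the parser a pure function like this makes it trivial to hand to worker actors later, since each line can be parsed independently of every other line.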
Question
- How can I read this log file efficiently in the Akka model? I have read that reading a file synchronously is better because it reduces disk head movement.
- In that case, there could be one FileReaderActor per file that reads each line and sends it for processing to, say, an EventProcessorRouter. The router may have many actors working on lines (from the file) and creating Events; there would be one Event per line.
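The dispatch behaviour described above (one reader streaming lines, a pool of N workers each taking every Nth line) is exactly what an Akka RoundRobinPool router would do. A minimal dependency-free sketch of that logic, with the worker count and the in-memory "lines" both illustrative assumptions:

```scala
// Stand-in for the lines a FileReaderActor would stream from disk one at a time.
val lines: Iterator[String] = Iterator.tabulate(10)(i => s"log line $i")

val workerCount = 4 // pool size; in Akka this would be e.g. RoundRobinPool(4)

// Each worker just records the lines it saw. In Akka, each would be an
// event-processor actor receiving one line per message and emitting one Event.
val workers = Array.fill(workerCount)(scala.collection.mutable.Buffer.empty[String])

lines.zipWithIndex.foreach { case (line, i) =>
  workers(i % workerCount) += line // in Akka: router ! line
}
```

Because the reader only ever moves forward through the file while the workers do the CPU-bound parsing, the single synchronous reader does not become a bottleneck until the workers are saturated.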
- I was also thinking of sending Events in batches to avoid too much network traffic. In that case, where should I accumulate these Events, and how would I know that all Events from the input file have been generated?
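One common answer to both batching questions: an accumulator collects Events and flushes a batch whenever it reaches a fixed size, while the reader counts the lines it emits and, on reaching end-of-file, reports the expected total so the accumulator can flush the final partial batch. A sketch of that bookkeeping (batch size, event names, and the counting scheme are all assumptions, not a prescribed Akka API):

```scala
import scala.collection.mutable.ArrayBuffer

val batchSize   = 3                                  // illustrative; tune for your network
val pending     = ArrayBuffer.empty[String]          // Events waiting to be sent
val sentBatches = ArrayBuffer.empty[Vector[String]]  // stand-in for "sent to server"

def receiveEvent(e: String): Unit = {
  pending += e
  if (pending.size == batchSize) flush()
}

def flush(): Unit =
  if (pending.nonEmpty) {
    sentBatches += pending.toVector // in Akka: serverActor ! Batch(pending.toVector)
    pending.clear()
  }

var seen = 0
val totalExpected = 7 // the reader learns this only after hitting end-of-file
(1 to totalExpected).foreach { i =>
  receiveEvent(s"event-$i")
  seen += 1
  if (seen == totalExpected) flush() // completion: flush the final partial batch
}
```

In actor terms, the accumulator is a natural place for this state: the reader sends it one message per Event plus a final end-of-file message carrying the total count, and the accumulator knows it is done when its own counter matches that total.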
Thanks