I have a live feed of logging data coming in over the network, and I need to calculate live statistics on it, like the one in my previous question. How should I design this module? It seems unrealistic (read: bad design) to keep applying a groupby function to the entire df every single time a message arrives. Can I just update one row and have its calculated column update automatically?
JFYI, I'd be running another thread that reads values from the df and prints them to a webpage every 5 seconds or so.
Of course, I could run groupby-apply every 5 seconds instead of doing it in real time, but I thought it'd be better to keep the df and the calculation independent of the printing module.
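To make the question concrete, here is a rough sketch of the kind of thing I'm picturing, not a working design. The per-key count and mean are only stand-ins for the real statistic, and `LiveStats`, `update`, and `snapshot` are names I made up for illustration. The network thread would call `update` once per message, and the printing thread would call `snapshot` on its own schedule.

```python
import threading
from collections import defaultdict


class LiveStats:
    """Running per-key count and mean, updated one message at a time."""

    def __init__(self):
        self._lock = threading.Lock()
        # key -> [count, total]; hypothetical layout, not taken from the real feed
        self._acc = defaultdict(lambda: [0, 0.0])

    def update(self, key, value):
        """Fold one incoming message into the aggregates in O(1)."""
        with self._lock:
            acc = self._acc[key]
            acc[0] += 1
            acc[1] += value

    def snapshot(self):
        """Return a copy of the current stats for the printing thread."""
        with self._lock:
            return {
                key: {"count": n, "mean": total / n}
                for key, (n, total) in self._acc.items()
            }


# Simulated messages standing in for the network feed (assumed shape: key, value).
stats = LiveStats()
for key, value in [("a", 1.0), ("b", 4.0), ("a", 3.0)]:
    stats.update(key, value)
print(stats.snapshot())
```

This only works for statistics that can be updated incrementally (counts, sums, means), which may or may not cover what I need, and it sidesteps the df entirely rather than updating it in place.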
Thoughts?