Community,
Objective: I'm running a Pi project (i.e. Python) that communicates with an Arduino to get data from a load cell once a second. What data structure should I use to log this data and analyze it in real time in Python?
I want to be able to do things like:
- Slice the data to get the value of the last logged datapoint.
- Slice the data to get the mean of the datapoints for the last n seconds.
- Perform a regression on the last n data points to get g/s.
- Remove from the log data points older than n seconds.
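To make the four operations above concrete, here is a minimal sketch of the kind of interface I mean. It assumes a hypothetical RollingLog class built on collections.deque that stores (timestamp, value) pairs. The class name, method names, and the 60-second window are illustrative assumptions, not an existing library, and the regression is a plain least-squares slope written out by hand.

```python
import time
from collections import deque


class RollingLog:
    """Hypothetical rolling log of (timestamp, value) pairs (illustration only)."""

    def __init__(self, max_age):
        self.max_age = max_age  # seconds of history to keep
        self.data = deque()     # oldest sample on the left, newest on the right

    def add(self, value, t=None):
        # Use the supplied timestamp if given, otherwise the current time.
        t = time.time() if t is None else t
        self.data.append((t, value))
        self.prune(t)

    def prune(self, now):
        # Remove data points older than max_age seconds.
        while self.data and now - self.data[0][0] > self.max_age:
            self.data.popleft()

    def last(self):
        # Value of the last logged data point.
        return self.data[-1][1]

    def mean_last(self, seconds):
        # Mean of the values logged within `seconds` of the newest sample.
        newest = self.data[-1][0]
        vals = [v for t, v in self.data if newest - t <= seconds]
        return sum(vals) / len(vals)

    def slope_last(self, n):
        # Least-squares slope (value units per second, e.g. g/s)
        # over the last n data points.
        pts = list(self.data)[-n:]
        if len(pts) < 2:
            raise ValueError("need at least two points for a regression")
        t_mean = sum(t for t, _ in pts) / len(pts)
        v_mean = sum(v for _, v in pts) / len(pts)
        num = sum((t - t_mean) * (v - v_mean) for t, v in pts)
        den = sum((t - t_mean) ** 2 for t, _ in pts)
        if den == 0:
            raise ValueError("all timestamps are identical")
        return num / den


# Example usage with made-up readings, one per second.
log = RollingLog(max_age=60)
for i in range(10):
    log.add(2.0 * i, t=float(i))
latest = log.last()
recent_mean = log.mean_last(2)
rate = log.slope_last(4)
```

Appending and dropping from the ends of a deque is cheap, but the mean and slope here rescan the window on every call, which is only reasonable at roughly one sample per second.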
Current Attempts:
Dictionaries: I have appended a new key with a rounded time to a dictionary (see below), but this makes slicing and analysis hard.
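import time
log = {}  # timestamp (seconds since epoch) -> load-cell reading
def log_data():
    # read_data() (defined elsewhere) returns the latest reading from the Arduino
    log[round(time.time(), 4)] = read_data()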
Pandas DataFrame: this was the one I was hoping for, because it makes time-series slicing and analysis easy, but this (How to handle incoming real time data with python pandas) seems to say it's a bad idea. I can't follow their solution (i.e. storing in a dictionary and calling df.append() in bulk every few seconds) because I want my rate calculations (regressions) to be in real time.
This question (ECG Data Analysis on a real-time signal in Python) seems to describe the same problem I have, but it has no real solutions.
Goal:
So what is the proper way to handle and analyze real-time time-series data in Python? It seems like something everyone would need to do, so I imagine there has to be pre-built functionality for this?
Thanks,
Michael