I'm running a python script that handles and processes data using Pandas functions inside an infinite loop. But the program seems to be leaking memory over time.
This is the graph produced by the memory-profiler package:
Sadly, I cannot identify the source of the increasing memory usage. To my knowledge, all data (pandas timeseries) are stored in the object Obj
, and I track the memory usage of this object using the pandas function .memory_usage
and the objsize function get_deep_size()
. According to their output, the memory usage should be stable around 90-100 MB. Other than this, I don't see where memory can ramp up.
It may be useful to know that the python program is running inside a docker container.
Below is a simplified version of the script which should illuminate the basic working principle.
from datetime import datetime
from time import sleep
import objsize
from dateutil import relativedelta
def update_data(Obj, now_utctime):
# attaining the newest timeseries data
new_data = requests.get(host, start=Obj.data[0].index, end=now_utctime)
Obj.data.append(new_data)
# cut off data older than 1 day
Obj.data.truncate(before=now_utctime-relativedelta.relativedelta(days=1))
class ExampleClass():
def __init__(self):
now_utctime = datetime.utcnow()
data = requests.get(host, start=now_utctime-relativedelta.relativedelta(days=1), end=now_utctime)
Obj = ExampleClass()
while True:
update_data(Obj, datetime.utcnow())
logger.info(f"Average at {datetime.utcnow()} is at {Obj.data.mean()}")
logger.info(f"Stored timeseries memory usage at {Obj.data.memory_usage(deep=True)* 10 ** -6} MB")
logger.info(f"Stored Object memory usage at {objsize.get_deep_size(Obj) * 10 ** -6} MB")
time.sleep(60)
Any advice into where memory could ramp up, or how to further investigate, would be appreciated.
EDIT: Looking at the chart, it makes sense that there will be spikes before I truncate, but since the data ingress is steady I don't know why it wouldn't normalize, but remain at a higher point. Then there is this sudden drop after every 4th cycle, even though the process does not have another, broader cycle that could explain this ...