I have a model, loaded into memory, running live in production that consumes messages from a message queue to make predictions. A separate process retrains the model every few hours (this is necessary). What is the best way to trigger the production model to reload the newly trained version into memory every time retraining occurs? Currently I just have the production model reload on an interval or after every 1000 messages.
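To make the current setup concrete, here is a minimal sketch of the "reload on an interval or every 1000 messages" approach. The names `ReloadingModel`, `load_fn`, `max_messages`, and `max_age_s` are hypothetical, and it assumes the model can be treated as a plain callable; the real loader and queue consumer would differ.

```python
import time
from typing import Any, Callable


class ReloadingModel:
    """Wraps a model and reloads it after N messages or T seconds (hypothetical helper)."""

    def __init__(
        self,
        load_fn: Callable[[], Callable[[Any], Any]],  # assumed: returns a callable model
        max_messages: int = 1000,
        max_age_s: float = 3600.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._load_fn = load_fn
        self._max_messages = max_messages
        self._max_age_s = max_age_s
        self._clock = clock
        self._reload()

    def _reload(self) -> None:
        # Load the latest trained model and reset both counters.
        self._model = self._load_fn()
        self._loaded_at = self._clock()
        self._seen = 0

    def predict(self, message: Any) -> Any:
        # Reload before predicting if either threshold has been reached.
        too_many = self._seen >= self._max_messages
        too_old = self._clock() - self._loaded_at >= self._max_age_s
        if too_many or too_old:
            self._reload()
        self._seen += 1
        return self._model(message)
```

The queue consumer would call `predict` once per message. The drawback is that a reload happens whether or not retraining has actually produced a new version, and a new version can sit unused until the next threshold is hit.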
I figured this would be easier with a web server instead of a message queue, since I could just expose an endpoint that triggers the reload. It's hard to find best practices on this topic.