So, my situation: I have been trying to distribute the load of some Python-based data science pipelines and, after much searching and some Q&A (Scale out jobs with high memory consumption but low computing power within AWS and using Docker: finding best solution), I have come to the conclusion that Nuclio might be a good fit, most likely running on top of Kubernetes.
Still, a major question remains. Say I want to do this:
@run_with_nuclio
def step_1(context):
    # in the docker image
    import pandas as pd
    # in my project, using submodules
    from my_big_submodule_1 import do_this
    from my_big_submodule_2 import do_that
    do_this()
    do_that()
I had major "context" problems with this in the past, so right now my super-bright, super-safe solution is to package the project (literally grab all the .py files and zip them), pass the archive to the function to be executed in the remote environment, unzip it there, and then run the function, roughly as in the sketch below.
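For context, the hack looks more or less like this (pack_project / unpack_and_run and the entry_point convention are just my own placeholder names, nothing Nuclio-specific):

import importlib
import io
import sys
import zipfile
from pathlib import Path


def pack_project(project_root: str) -> bytes:
    """Client side: zip every .py file in the project into an in-memory archive."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w", zipfile.ZIP_DEFLATED) as zf:
        for py_file in Path(project_root).rglob("*.py"):
            zf.write(py_file, py_file.relative_to(project_root).as_posix())
    return buf.getvalue()


def unpack_and_run(payload: bytes, entry_point: str, target_dir: str = "/tmp/project"):
    """Remote side: unzip the code, put it on sys.path, import and call the step."""
    with zipfile.ZipFile(io.BytesIO(payload)) as zf:
        zf.extractall(target_dir)
    sys.path.insert(0, target_dir)
    module_name, func_name = entry_point.split(":")  # e.g. "steps.pipeline:step_1"
    module = importlib.import_module(module_name)
    return getattr(module, func_name)(context=None)   # illustrative; no real context here

The archive travels with every invocation, so any change in the project is picked up without rebuilding an image.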
This is "good" because it provides me with immense flexibility. But this spaghetti solution is not the way to go.
Is there a way to do this by leveraging the Nuclio framework itself? (In the examples, functions seem to be completely self-contained, never importing project packages beyond the "classic" ones already present in the image.) Something like the sketch below is what I am hoping for.
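To illustrate, a minimal handler of the kind I would like to end up with, assuming my submodules could somehow be baked into the function image at build time (the handler/Response signature follows the Nuclio Python examples; the my_big_submodule_* imports are mine):

# handler.py -- what the deployed Nuclio function would ideally look like
import pandas as pd                       # provided by the base docker image

# these imports only resolve if my project code ships with the function image,
# which is exactly the part I do not know how to do cleanly with Nuclio
from my_big_submodule_1 import do_this
from my_big_submodule_2 import do_that


def handler(context, event):
    # standard Nuclio entry point: do the work, return an HTTP-style response
    do_this()
    do_that()
    return context.Response(body="step_1 done",
                            headers={},
                            content_type="text/plain",
                            status_code=200)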