
I am trying to build an AWS SageMaker pipeline. In my root directory I have a process.py script and a utils.py script. In process.py I import additional functions from utils.py. When I run the pipeline, the processing job fails because it can't find the utils module:

  File "/opt/ml/processing/input/code/process.py", line 4, in <module>
    from utils import (
ModuleNotFoundError: No module named 'utils'

I have tried setting up the framework processor as follows:

from sagemaker.processing import FrameworkProcessor, ProcessingInput, ProcessingOutput
from sagemaker.sklearn.estimator import SKLearn
from sagemaker.workflow.steps import ProcessingStep

sklearn_processor = FrameworkProcessor(
    estimator_cls=SKLearn,
    framework_version=framework_version,
    instance_type="ml.m5.xlarge",
    instance_count=1,
    sagemaker_session=sagemaker_session,
    image_uri=image_uri,
    role=role,
)

step_args = sklearn_processor.run(
    inputs=[
        ProcessingInput(source=s3_bucket, destination="/opt/ml/processing/input"),
    ],
    outputs=[
        ProcessingOutput(output_name="train", source="/opt/ml/processing/train"),
        ProcessingOutput(output_name="test", source="/opt/ml/processing/test")
    ],
    code="process.py",
)

step_process = ProcessingStep(
    name="process-step",
    step_args=step_args,
)

I know that in this question the answer was to specify source_dir, but in my case utils.py and process.py are already in the same directory (namely the root). Do I need to specify source_dir as the root? If so, how would I do that?
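My best guess at what the change would look like is below, but I haven't verified it; I'm assuming that source_dir="." is resolved relative to the directory where the pipeline is defined, so utils.py would get uploaded alongside process.py:

step_args = sklearn_processor.run(
    code="process.py",
    source_dir=".",  # my assumption: the project root, so utils.py is included
    inputs=[
        ProcessingInput(source=s3_bucket, destination="/opt/ml/processing/input"),
    ],
    outputs=[
        ProcessingOutput(output_name="train", source="/opt/ml/processing/train"),
        ProcessingOutput(output_name="test", source="/opt/ml/processing/test"),
    ],
)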
