
I am using MLflow to register my model. I am trying to use 'Scenario 4', where artifacts are uploaded from the local machine to an S3 bucket.

  1. Add the S3 bucket credentials to .aws/credentials

  2. Set the S3 endpoint and the MLflow tracking URI:

    import os
    os.environ["MLFLOW_S3_ENDPOINT_URL"] = 'https://storage.yandexcloud.net'
    os.environ["MLFLOW_TRACKING_URI"] = 'http://:8000'

  3. Log the model to S3 via MLflow:

    import mlflow
    import mlflow.sklearn
    mlflow.set_experiment("my")
    ...
    mlflow.sklearn.log_model(model, artifact_path="models_mlflow")

But I get this error:

MlflowException: API request to http://<IP>:8000/api/2.0/mlflow-artifacts/artifacts/6/95972bcc493c4a8cbd8432fea4cc8bac/artifacts/models_mlflow/model.pkl failed with exception HTTPConnectionPool(host='62.84.121.234', port=8000): Max retries exceeded with url: /api/2.0/mlflow-artifacts/artifacts/6/95972bcc493c4a8cbd8432fea4cc8bac/artifacts/models_mlflow/model.pkl (Caused by ResponseError('too many 503 error responses'))
sergzemsk

1 Answer


If there are no network problems (with the load balancer, router, ingress or route) and the S3 credentials are correct and accessible to the MLflow client (in this scenario files do not pass through the MLflow server but are sent directly from the client's local storage to the S3 bucket), then another typical cause of such HTTP 503 errors is memory exhaustion, including excessively fast memory allocation, e.g. when reading huge amounts of data from the pickle format. This applies to any containerized app, not just the MLflow server.
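
To narrow down which of these causes applies, a rough diagnostic along the following lines may help. It is only a sketch using the Python standard library: the helper names `pickled_size_mb` and `server_status` are made up for illustration, and the `/health` path is assumed to be served by your MLflow server version (it may instead be answered by a proxy in front of it). Note that `pickle.dumps` builds the whole payload in memory, so for a very large model the size check can itself be expensive.

```python
import pickle
import urllib.error
import urllib.request


def pickled_size_mb(obj) -> float:
    """Return the size of obj's pickle in MiB (serialized fully in memory)."""
    payload = pickle.dumps(obj, protocol=pickle.HIGHEST_PROTOCOL)
    return len(payload) / (1024 * 1024)


def server_status(tracking_uri: str, timeout: float = 5.0):
    """Return the HTTP status of <tracking_uri>/health, or None if unreachable.

    The /health path is an assumption about the server setup.
    """
    url = tracking_uri.rstrip("/") + "/health"
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status
    except urllib.error.HTTPError as err:
        # The server (or a proxy in front of it) answered with an error code,
        # for example 503.
        return err.code
    except (urllib.error.URLError, OSError):
        # No HTTP answer at all: DNS failure, refused connection, timeout.
        return None
```

Calling `server_status(os.environ["MLFLOW_TRACKING_URI"])` before logging and `pickled_size_mb(model)` on the model from the question gives two data points: a 503 on a tiny request points towards the network or the server itself, while a healthy server combined with a very large pickle makes memory pressure the more likely suspect.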

mirekphd