I want to pass a GCP storage URL as an argument when running my Docker image, so that it can pull my CSV file from storage and print the dataset.

Below is my Dockerfile:

# Use the official lightweight Python image.
# https://hub.docker.com/_/python
FROM continuumio/miniconda3


# Copy local code to the container image.
ENV APP_HOME /app
WORKDIR $APP_HOME
COPY . ./

# Install production dependencies.
RUN pip install Flask gunicorn
RUN pip install scikit-learn==0.20.2 firefly-python==0.1.15
RUN pip install --upgrade google-cloud-storage
ENTRYPOINT ["python"]
CMD ["pre.py"]

I tried running the Docker image with the command below and got the following error:

docker run preprocess:v1 "https://storage.googleapis.com/MYBucket/app/Model/IrisClassifier.sav"

python: can't open file 'https://storage.googleapis.com/MYBucket/app/Model/IrisClassifier.sav': [Errno 2] No such file or directory

Below is my pre.py script:

import os
import argparse
from google.cloud import storage
from sklearn.externals import joblib
from urllib.request import urlopen

def parse_arguments():
    print('entered parse arg')
    parser = argparse.ArgumentParser()
    parser.add_argument('data_dir', type=str, help='GCSpath')
    args = parser.parse_known_args()[0]
    print('Argument passed')
    print(os.getcwd())
    print('STARTING CLOUD RETRIVAL')
    print('*****client initialized')
    dataset_load = joblib.load(urlopen(args.data_dir))
    print('*****loaded Dataset')
    print(dataset_load)


def main(_):
    print("Prior to entering arg")
    parse_arguments()

I want to be able to pass a GCP bucket URL like this one when running my Docker image: https://storage.googleapis.com/MYBucket/app/Model/IrisClassifier.sav

furas
harish kumaar

1 Answer

First, you need to move the script name from CMD into ENTRYPOINT:

FROM continuumio/miniconda3

ENV APP_HOME /app
WORKDIR $APP_HOME
COPY . ./

RUN pip install Flask gunicorn
RUN pip install scikit-learn==0.20.2 firefly-python==0.1.15
RUN pip install --upgrade google-cloud-storage
ENTRYPOINT ["python", "pre.py"]

Then you can pass your URL.

The problem with your setup is:

Docker starts the entrypoint, which is just python, and the argument you pass to docker run overrides the CMD (pre.py), which gives you:

python YOUR_URL
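
To make the override rule concrete, here is a small Python model of how the container's command line gets assembled. It only illustrates the rule described above and is not Docker's actual code; the function name `container_argv` is made up for this sketch.

```python
def container_argv(entrypoint, cmd, run_args):
    """Model of the exec-form rule: arguments given after the image name
    in `docker run` replace CMD, and the result is appended to ENTRYPOINT."""
    effective_cmd = run_args if run_args else cmd
    return list(entrypoint) + list(effective_cmd)


URL = "https://storage.googleapis.com/MYBucket/app/Model/IrisClassifier.sav"

# Original Dockerfile: ENTRYPOINT ["python"] and CMD ["pre.py"].
# The URL replaces pre.py, so python is asked to run the URL as a script.
original = container_argv(["python"], ["pre.py"], [URL])

# Changed Dockerfile: ENTRYPOINT ["python", "pre.py"] and no CMD.
# The URL is appended after pre.py and reaches the script as an argument.
fixed = container_argv(["python", "pre.py"], [], [URL])

print(original)
print(fixed)
```

With the original Dockerfile the script name is lost as soon as any argument is passed; with the changed one it is part of the fixed prefix.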

Update

I do not know whether you have an if statement that runs the main function, but here is how you should edit the script:

def main():
    print("Prior to entering arg")
    parse_arguments()


if __name__ == '__main__':
    main()
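
As a sketch of how the URL then reaches the script, the helpers below parse the positional argument and split a `https://storage.googleapis.com/<bucket>/<object>` URL into the bucket and object names that the `google-cloud-storage` client works with. The names `split_gcs_url` and `download` and the `argv` parameter are illustrative and not part of the original script, and the download step assumes network access and GCP credentials inside the container.

```python
import argparse
from urllib.parse import urlparse


def parse_arguments(argv=None):
    """Return the parsed namespace; argv=None means 'use sys.argv[1:]'."""
    parser = argparse.ArgumentParser()
    parser.add_argument('data_dir', type=str, help='GCSpath')
    # parse_known_args returns (namespace, unknown_args); keep the namespace.
    return parser.parse_known_args(argv)[0]


def split_gcs_url(url):
    """Split https://storage.googleapis.com/<bucket>/<object> into its parts."""
    path = urlparse(url).path.lstrip('/')
    bucket_name, _, blob_name = path.partition('/')
    return bucket_name, blob_name


def download(url, destination):
    """Illustrative only: needs network access and GCP credentials."""
    # Imported lazily so the parsing helpers run without the package installed.
    from google.cloud import storage
    bucket_name, blob_name = split_gcs_url(url)
    client = storage.Client()
    client.bucket(bucket_name).blob(blob_name).download_to_filename(destination)


args = parse_arguments(
    ["https://storage.googleapis.com/MYBucket/app/Model/IrisClassifier.sav"]
)
# The attribute name matches the name given to add_argument.
bucket, blob = split_gcs_url(args.data_dir)
print(bucket, blob)
```

Passing `argv` explicitly makes the parsing easy to try outside Docker; inside the container, calling `parse_arguments()` with no argument reads whatever `docker run` appended after `pre.py`.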
LinPy
  • I can't understand. Can you please explain, as I am new to Docker? – harish kumaar Nov 04 '19 at 07:21
  • `docker run preprocess:v1 YOUR_URL` will overwrite the `CMD` in your image, therefore you will end up with `python YOUR_URL` without `pre.py`, and that is what the error you are getting is about – LinPy Nov 04 '19 at 07:22
  • I edited the Dockerfile as you said: ENTRYPOINT ["python","pre.py"]. But now there is no output when I run the Docker image: docker run -d preprocess:v5 https://storage.googleapis.com/MyBucket/app/Model/iris.csv. As you can see in my Python code, even the print statements which I have used for logs are not printing – harish kumaar Nov 04 '19 at 08:31
  • That is another problem, see this: https://stackoverflow.com/questions/29663459/python-app-does-not-print-anything-when-running-detached-in-docker – LinPy Nov 04 '19 at 08:34
  • I have set ENTRYPOINT ["python","-u","pre.py"] as you said, but I still get no output. Also I can see logs as the container is stopped – harish kumaar Nov 04 '19 at 09:01
  • I ran the following: docker run --name=prepy -it preprocess:v6 https://storage.googleapis.com/MyBucket/app/Model/iris.csv and then docker logs prepy. I don't get any logs – harish kumaar Nov 04 '19 at 09:09
  • You could first run your script locally, without Docker, to see what happens – LinPy Nov 04 '19 at 10:00
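
On the buffering point raised in the comments: besides `python -u`, output buffering can also be ruled out from inside the script by flushing each message. A minimal sketch, assuming a helper called `log` (not part of the original script):

```python
def log(message):
    """Print a progress message and flush it right away, so it is not held
    in Python's output buffer while the container is running."""
    print(message, flush=True)


log('entered parse arg')
```

This only addresses buffering. If nothing is printed at all, it is still worth checking that `main()` is actually called when the script runs.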