Kubeflow Training Operator provides Kubernetes custom resources that make it easy to run distributed or non-distributed TensorFlow/PyTorch/Apache MXNet/XGBoost/MPI jobs on Kubernetes.
Questions tagged [kubeflow]
433 questions
23
votes
9 answers
Sudden ImportError: cannot import name 'appengine' from 'requests.packages.urllib3.contrib' error on pipeline
My pipelines and schedulers were running smoothly without any problems. After I went out to lunch, I changed the number of epochs a neural network would run, saved the .yaml file again, and left it in the bucket named "budgetff".
Afterwards,…

filipe
18
votes
7 answers
Error occurred when finalizing GeneratorDataset iterator: Cancelled: Operation was cancelled
While running a Kubeflow pipeline whose code uses TensorFlow 2.0, the error below is displayed at the end of each epoch:
W tensorflow/core/kernels/data/generator_dataset_op.cc:103] Error occurred when finalizing GeneratorDataset iterator: Cancelled:…

Radhi
11
votes
1 answer
Kubeflow vs other options
I am trying to find when it makes sense to create your own Kubeflow MLOps platform:
If you are a TensorFlow-only shop, do you still need Kubeflow? Why not TFX only? Orchestration can be done with Airflow.
Why use Kubeflow if all you are using…

Cengiz
10
votes
3 answers
How to pass data or files between Kubeflow containerized components in Python
I'm exploring Kubeflow as an option to deploy and connect various components of a typical ML pipeline. I'm using Docker containers as Kubeflow components, and so far I've been unable to successfully use the ContainerOp.file_outputs object to pass results…

Ash
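For the file-passing question above, one common pattern is for a component to write its result to a file at an agreed path, so that a later step can read it. The sketch below shows only the container-side logic in plain Python and does not use the KFP SDK; `produce`, `consume`, and the example path are hypothetical names chosen for illustration, not part of the original question.

```python
import json
from pathlib import Path
from typing import Dict, List


def produce(output_path: str, values: List[float]) -> None:
    """Hypothetical producer step: write a small JSON result to output_path."""
    path = Path(output_path)
    # Create the parent directory in case it does not exist in the container.
    path.parent.mkdir(parents=True, exist_ok=True)
    result = {"count": len(values), "mean": sum(values) / len(values)}
    path.write_text(json.dumps(result))


def consume(input_path: str) -> Dict[str, float]:
    """Hypothetical consumer step: read the JSON result written upstream."""
    return json.loads(Path(input_path).read_text())
```

In a pipeline, the path given to `produce` would presumably be the one the component declares as its output file (for example in `ContainerOp.file_outputs`); how the content then reaches the consumer depends on the SDK version in use.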
8
votes
2 answers
How to use tqdm in Kubernetes
I'm using Kubernetes, and a training job runs on the cluster.
I'm using tqdm as a progress bar, but contrary to what I expected, the progress bar doesn't show up when I check the Kubernetes Pod logs. Does anyone have a solution to this problem?

Piljae Chae
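In-place progress bars redraw a single line with carriage returns, which tends not to render well in line-oriented pod logs. One possible workaround, sketched here with only the standard library rather than tqdm itself, is to emit a newline-terminated progress line every few steps and flush it immediately. `log_progress` and its `every` parameter are hypothetical names, and whether this addresses the asker's exact setup is an assumption.

```python
import sys
from typing import Iterable, Iterator, TextIO, TypeVar

T = TypeVar("T")


def log_progress(
    items: Iterable[T],
    total: int,
    every: int = 10,
    stream: TextIO = sys.stdout,
) -> Iterator[T]:
    """Yield items unchanged, writing one full log line every `every` items."""
    for done, item in enumerate(items, start=1):
        yield item
        if done % every == 0 or done == total:
            # A complete line ending in "\n" shows up as one log entry.
            stream.write(f"progress: {done}/{total} ({100 * done // total}%)\n")
            # Flush so the line is not held back by output buffering.
            stream.flush()
```

A training loop could then be written as `for batch in log_progress(batches, total=len(batches)): ...`, where `batches` stands in for whatever the job iterates over.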
8
votes
2 answers
microk8s Broken K8s Dashboard and Kubeflow Dashboard
I'm using microk8s in an Ubuntu 18.04 LTS VM with 3 cores, 60 GB of storage, and 12 GB of memory. I followed the instructions on the microk8s website to install it.
$ snap install microk8s --classic --channel=1.18/stable
$ sudo microk8s start
$ sudo…

lwileczek
8
votes
1 answer
Kubeflow Pipeline Termination Notification
I tried to add logic that sends a Slack notification when the pipeline terminates due to an error. I tried to implement this with ExitHandler, but it seems the ExitHandler can't depend on any op. Do you have any good ideas?

Wenmin Wu
8
votes
6 answers
How to get the id of the run from within a component?
I'm doing some experimentation with Kubeflow Pipelines and I'm interested in retrieving the run id to save along with some metadata about the pipeline execution. Is there any way I can do so from a component like a ContainerOp?

DSF
7
votes
3 answers
kubeflow pipeline dynamic output list as input parameter
I use a ParallelFor over a dynamic list. I want to collect all the outputs from the loop, and pass them to another ContainerOp.
Something like the following, which obviously does not work, since the outputs list will be static:
with…

user3599803
7
votes
2 answers
How to escape "{{" and "}}" in argo workflow
I want to run an Argo workflow in which a value is surrounded by double braces. Argo tries to resolve it, but I don't want Argo to resolve it.
The following is a fragment of a Katib StudyJob workflow manifest:
workerSpec:
goTemplate:
…

shabbir
6
votes
1 answer
How do we assign pods properly so that KFServing can scale down GPU Instances to zero?
I'm setting up an InferenceService using Argo and KFServing with Amazon EKS (Kubernetes). It's important to know that our team has one EKS cluster per environment, which means there can be multiple applications within our cluster that we don't…

Daniel Hair
5
votes
1 answer
Is it possible to mix kubeflow components with tensorflow extended components?
It looks like Kubeflow has deprecated all of their TFX components. I currently have some custom Kubeflow components that help launch some of my data pipelines and I was hoping I could use some TFX components in the same kubeflow pipeline. Is there a…

sleepyowl
5
votes
2 answers
Sharing secrets in Kubeflow pipeline
I want to share some secrets with my Kubeflow pipeline so I can use them as environment variables in my containers. I've written a pipeline-secrets.yaml that looks like this:
apiVersion: v1
kind: Secret
metadata:
  name: pipeline-secrets
…

João Areias
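Once a Secret's keys are exposed to a container as environment variables, the container code only needs to read them and fail clearly when one is absent. A minimal sketch of that container-side check follows; `require_env` and the variable names in the usage note are hypothetical, and how the Secret gets mounted into the pipeline step is outside what this example covers.

```python
import os
from typing import Dict, Iterable, Mapping, Optional


def require_env(
    names: Iterable[str],
    environ: Optional[Mapping[str, str]] = None,
) -> Dict[str, str]:
    """Return the requested variables, raising if any is missing or empty."""
    wanted = list(names)
    # Default to the real process environment; a mapping can be passed for tests.
    env = os.environ if environ is None else environ
    missing = [name for name in wanted if not env.get(name)]
    if missing:
        raise RuntimeError(
            "missing environment variables: " + ", ".join(sorted(missing))
        )
    return {name: env[name] for name in wanted}
```

A step's entrypoint could call something like `require_env(["API_TOKEN"])` at startup, where `API_TOKEN` is a placeholder for whichever key the Secret defines.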
5
votes
1 answer
What's the difference between TFServing and KFServing on Kubeflow
TFServing and KFServing both deploy models on Kubeflow and let users easily consume a model as a service without needing to know the details of Kubernetes, hiding the infrastructure layers.
TFServing is from TensorFlow; it can also run on Kubeflow or…

Kevin Su
5
votes
2 answers
Aggregate results when using Kubeflow Pipelines kfp.ParallelFor
What is a good pattern for aggregating the results from Kubeflow Pipelines kfp.ParallelFor?

Jet Basrawi
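One generic fan-in pattern for this kind of question is to have each loop iteration write its result as a separate file under a shared location, then run a single downstream step that reads every file. The sketch below illustrates the idea in plain Python without the KFP SDK. `write_shard`, `aggregate`, and the `shard-NNNNN.json` naming are hypothetical, and it assumes all steps can see the same directory (for example a mounted volume).

```python
import json
from pathlib import Path
from typing import Any, List


def write_shard(shared_dir: str, index: int, result: Any) -> Path:
    """Hypothetical per-iteration step: store one result as its own JSON file."""
    path = Path(shared_dir) / f"shard-{index:05d}.json"
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(json.dumps(result))
    return path


def aggregate(shared_dir: str) -> List[Any]:
    """Hypothetical fan-in step: load every shard, ordered by iteration index."""
    # Zero-padded indices make lexicographic order match numeric order.
    shards = sorted(Path(shared_dir).glob("shard-*.json"))
    return [json.loads(shard.read_text()) for shard in shards]
```

Because each iteration writes its own file, the iterations can finish in any order and the aggregating step still sees a deterministic list.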