I deployed Kubeflow using the following command:
juju deploy kubeflow
The following two pods did not start. Both are stuck in the ImagePullBackOff state:
- kfp-viz
- kfp-profile-controller
Output of kubectl describe pod for kfp-viz:
Name: kfp-viz-65bc89cd9b-dnng9
Namespace: kubeflow
Priority: 0
Node: mlops/10.50.60.90
Start Time: Mon, 30 May 2022 05:22:25 +0000
Labels: app.kubernetes.io/name=kfp-viz
pod-template-hash=65bc89cd9b
Annotations: apparmor.security.beta.kubernetes.io/pod: runtime/default
charm.juju.is/modified-version: 0
cni.projectcalico.org/podIP: 10.1.190.114/32
cni.projectcalico.org/podIPs: 10.1.190.114/32
controller.juju.is/id: ae0e16a7-f10b-41e9-8f64-35346b6c91dd
model.juju.is/id: a6f5b73a-cb38-42cd-8f8e-36aced0c1bbd
seccomp.security.beta.kubernetes.io/pod: docker/default
unit.juju.is/id: kfp-viz/0
Status: Pending
IP: 10.1.190.114
IPs:
IP: 10.1.190.114
Controlled By: ReplicaSet/kfp-viz-65bc89cd9b
Init Containers:
juju-pod-init:
Container ID: containerd://ffab869ab2beeab6a6bfde53be1175da58044c35da4d0bc2b66db231c585b142
Image: jujusolutions/jujud-operator:2.9.29
Image ID: sha256:47127013daad1c7215de0566f312d2a00eb83b3ef898e7b07f1ceb9e42860b4a
Port: <none>
Host Port: <none>
Command:
/bin/sh
Args:
-c
export JUJU_DATA_DIR=/var/lib/juju
export JUJU_TOOLS_DIR=$JUJU_DATA_DIR/tools
mkdir -p $JUJU_TOOLS_DIR
cp /opt/jujud $JUJU_TOOLS_DIR/jujud
initCmd=$($JUJU_TOOLS_DIR/jujud help commands | grep caas-unit-init)
if test -n "$initCmd"; then
$JUJU_TOOLS_DIR/jujud caas-unit-init --debug --wait;
else
exit 0
fi
State: Terminated
Reason: Completed
Exit Code: 0
Started: Mon, 30 May 2022 05:22:35 +0000
Finished: Mon, 30 May 2022 05:24:10 +0000
Ready: True
Restart Count: 0
Environment: <none>
Mounts:
/var/lib/juju from juju-data-dir (rw)
/var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-82xc9 (ro)
Containers:
ml-pipeline-visualizationserver:
Container ID:
Image: registry.jujucharms.com/charm/c2o31yht1y825t6n49mwko4wyel0rracnrjn5/oci-image@sha256:13c46cf878062fd6ad672cbec4854eba7e869cd0123a8975bea49b9d75d4e698
Image ID:
Port: 8888/TCP
Host Port: 0/TCP
State: Waiting
Reason: ImagePullBackOff
Ready: False
Restart Count: 0
Liveness: exec [wget -q -S -O - http://localhost:8888/] delay=3s timeout=2s period=5s #success=1 #failure=3
Readiness: exec [wget -q -S -O - http://localhost:8888/] delay=3s timeout=2s period=5s #success=1 #failure=3
Environment: <none>
Mounts:
/usr/bin/juju-run from juju-data-dir (rw,path="tools/jujud")
/var/lib/juju from juju-data-dir (rw)
/var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-82xc9 (ro)
Conditions:
Type Status
Initialized True
Ready False
ContainersReady False
PodScheduled True
Volumes:
juju-data-dir:
Type: EmptyDir (a temporary directory that shares a pod's lifetime)
Medium:
SizeLimit: <unset>
kube-api-access-82xc9:
Type: Projected (a volume that contains injected data from multiple sources)
TokenExpirationSeconds: 3607
ConfigMapName: kube-root-ca.crt
ConfigMapOptional: <nil>
DownwardAPI: true
QoS Class: BestEffort
Node-Selectors: kubernetes.io/arch=amd64
Tolerations: node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Warning Failed 10m (x338 over 35h) kubelet (combined from similar events): Failed to pull image "registry.jujucharms.com/charm/c2o31yht1y825t6n49mwko4wyel0rracnrjn5/oci-image@sha256:13c46cf878062fd6ad672cbec4854eba7e869cd0123a8975bea49b9d75d4e698": rpc error: code = FailedPrecondition desc = failed to pull and unpack image "registry.jujucharms.com/charm/c2o31yht1y825t6n49mwko4wyel0rracnrjn5/oci-image@sha256:13c46cf878062fd6ad672cbec4854eba7e869cd0123a8975bea49b9d75d4e698": failed commit on ref "layer-sha256:d8f1984ce468ddfcc0f2752e09c8bbb5ea8513d55d5dd5f911d2d3dd135e5a84": "layer-sha256:d8f1984ce468ddfcc0f2752e09c8bbb5ea8513d55d5dd5f911d2d3dd135e5a84" failed size validation: 19367144 != 32561845: failed precondition
Normal BackOff 10s (x4271 over 36h) kubelet Back-off pulling image "registry.jujucharms.com/charm/c2o31yht1y825t6n49mwko4wyel0rracnrjn5/oci-image@sha256:13c46cf878062fd6ad672cbec4854eba7e869cd0123a8975bea49b9d75d4e698"
Output of kubectl describe pod for kfp-profile-controller (the operator pod first, then the workload pod):
Name: kfp-profile-controller-operator-0
Namespace: kubeflow
Priority: 0
Node: mlops/10.50.60.90
Start Time: Mon, 30 May 2022 05:15:50 +0000
Labels: controller-revision-hash=kfp-profile-controller-operator-755985f4fc
operator.juju.is/name=kfp-profile-controller
operator.juju.is/target=application
statefulset.kubernetes.io/pod-name=kfp-profile-controller-operator-0
Annotations: apparmor.security.beta.kubernetes.io/pod: runtime/default
cni.projectcalico.org/podIP: 10.1.190.93/32
cni.projectcalico.org/podIPs: 10.1.190.93/32
controller.juju.is/id: ae0e16a7-f10b-41e9-8f64-35346b6c91dd
juju.is/version: 2.9.29
model.juju.is/id: a6f5b73a-cb38-42cd-8f8e-36aced0c1bbd
seccomp.security.beta.kubernetes.io/pod: docker/default
Status: Running
IP: 10.1.190.93
IPs:
IP: 10.1.190.93
Controlled By: StatefulSet/kfp-profile-controller-operator
Containers:
juju-operator:
Container ID: containerd://a4cc8356c4ce7d8bbb9282ccee17c38937a7a3a3bfa13caa29eaeccb63a86d30
Image: jujusolutions/jujud-operator:2.9.29
Image ID: sha256:47127013daad1c7215de0566f312d2a00eb83b3ef898e7b07f1ceb9e42860b4a
Port: <none>
Host Port: <none>
Command:
/bin/sh
Args:
-c
export JUJU_DATA_DIR=/var/lib/juju
export JUJU_TOOLS_DIR=$JUJU_DATA_DIR/tools
mkdir -p $JUJU_TOOLS_DIR
cp /opt/jujud $JUJU_TOOLS_DIR/jujud
$JUJU_TOOLS_DIR/jujud caasoperator --application-name=kfp-profile-controller --debug
State: Running
Started: Mon, 30 May 2022 05:16:05 +0000
Ready: True
Restart Count: 0
Environment:
JUJU_APPLICATION: kfp-profile-controller
JUJU_OPERATOR_SERVICE_IP: 10.152.183.68
JUJU_OPERATOR_POD_IP: (v1:status.podIP)
JUJU_OPERATOR_NAMESPACE: kubeflow (v1:metadata.namespace)
Mounts:
/var/lib/juju/agents from charm (rw)
/var/lib/juju/agents/application-kfp-profile-controller/operator.yaml from kfp-profile-controller-operator-config (rw,path="operator.yaml")
/var/lib/juju/agents/application-kfp-profile-controller/template-agent.conf from kfp-profile-controller-operator-config (rw,path="template-agent.conf")
/var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-6x9qh (ro)
Conditions:
Type Status
Initialized True
Ready True
ContainersReady True
PodScheduled True
Volumes:
charm:
Type: PersistentVolumeClaim (a reference to a PersistentVolumeClaim in the same namespace)
ClaimName: charm-kfp-profile-controller-operator-0
ReadOnly: false
kfp-profile-controller-operator-config:
Type: ConfigMap (a volume populated by a ConfigMap)
Name: kfp-profile-controller-operator-config
Optional: false
kube-api-access-6x9qh:
Type: Projected (a volume that contains injected data from multiple sources)
TokenExpirationSeconds: 3607
ConfigMapName: kube-root-ca.crt
ConfigMapOptional: <nil>
DownwardAPI: true
QoS Class: BestEffort
Node-Selectors: <none>
Tolerations: node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
Events: <none>
Name: kfp-profile-controller-68f554c765-7vvmj
Namespace: kubeflow
Priority: 0
Node: mlops/10.50.60.90
Start Time: Mon, 30 May 2022 05:34:26 +0000
Labels: app.kubernetes.io/name=kfp-profile-controller
pod-template-hash=68f554c765
Annotations: apparmor.security.beta.kubernetes.io/pod: runtime/default
charm.juju.is/modified-version: 0
cni.projectcalico.org/podIP: 10.1.190.134/32
cni.projectcalico.org/podIPs: 10.1.190.134/32
controller.juju.is/id: ae0e16a7-f10b-41e9-8f64-35346b6c91dd
model.juju.is/id: a6f5b73a-cb38-42cd-8f8e-36aced0c1bbd
seccomp.security.beta.kubernetes.io/pod: docker/default
unit.juju.is/id: kfp-profile-controller/0
Status: Pending
IP: 10.1.190.134
IPs:
IP: 10.1.190.134
Controlled By: ReplicaSet/kfp-profile-controller-68f554c765
Init Containers:
juju-pod-init:
Container ID: containerd://7a1f2d07bce6f58bd7e9899e0f26a2cd36abfacad55d0e6b43fbb8fcb93df289
Image: jujusolutions/jujud-operator:2.9.29
Image ID: sha256:47127013daad1c7215de0566f312d2a00eb83b3ef898e7b07f1ceb9e42860b4a
Port: <none>
Host Port: <none>
Command:
/bin/sh
Args:
-c
export JUJU_DATA_DIR=/var/lib/juju
export JUJU_TOOLS_DIR=$JUJU_DATA_DIR/tools
mkdir -p $JUJU_TOOLS_DIR
cp /opt/jujud $JUJU_TOOLS_DIR/jujud
initCmd=$($JUJU_TOOLS_DIR/jujud help commands | grep caas-unit-init)
if test -n "$initCmd"; then
$JUJU_TOOLS_DIR/jujud caas-unit-init --debug --wait;
else
exit 0
fi
State: Terminated
Reason: Completed
Exit Code: 0
Started: Mon, 30 May 2022 05:34:32 +0000
Finished: Mon, 30 May 2022 05:35:50 +0000
Ready: True
Restart Count: 0
Environment: <none>
Mounts:
/var/lib/juju from juju-data-dir (rw)
/var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-lqf7h (ro)
Containers:
kubeflow-pipelines-profile-controller:
Container ID:
Image: registry.jujucharms.com/charm/gm1axzm8pxqlan75l3a7znu2mv5bf0pm1wfar/oci-image@sha256:14ec52252771f8fa904afbdac497c80fc3234d518b1e0bced0c810d5748a7347
Image ID:
Port: 80/TCP
Host Port: 0/TCP
Command:
python
Args:
/hooks/sync.py
State: Waiting
Reason: ImagePullBackOff
Ready: False
Restart Count: 0
Environment:
CONTROLLER_PORT: 80
DISABLE_ISTIO_SIDECAR: false
KFP_DEFAULT_PIPELINE_ROOT:
KFP_VERSION: 1.7.0-rc.3
METADATA_GRPC_SERVICE_HOST: mlmd.kubeflow
METADATA_GRPC_SERVICE_PORT: 8080
MINIO_ACCESS_KEY: minio
MINIO_HOST: minio
MINIO_NAMESPACE: kubeflow
MINIO_PORT: 9000
MINIO_SECRET_KEY: SV25N7GCN6HAYV19M4GMHGX0YTZ840
Mounts:
/hooks from kubeflow-pipelines-profile-controller-code (rw)
/usr/bin/juju-run from juju-data-dir (rw,path="tools/jujud")
/var/lib/juju from juju-data-dir (rw)
/var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-lqf7h (ro)
Conditions:
Type Status
Initialized True
Ready False
ContainersReady False
PodScheduled True
Volumes:
juju-data-dir:
Type: EmptyDir (a temporary directory that shares a pod's lifetime)
Medium:
SizeLimit: <unset>
kubeflow-pipelines-profile-controller-code:
Type: ConfigMap (a volume populated by a ConfigMap)
Name: kubeflow-pipelines-profile-controller-code
Optional: false
kube-api-access-lqf7h:
Type: Projected (a volume that contains injected data from multiple sources)
TokenExpirationSeconds: 3607
ConfigMapName: kube-root-ca.crt
ConfigMapOptional: <nil>
DownwardAPI: true
QoS Class: BestEffort
Node-Selectors: kubernetes.io/arch=amd64
Tolerations: node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal BackOff 28m (x4631 over 38h) kubelet Back-off pulling image "registry.jujucharms.com/charm/gm1axzm8pxqlan75l3a7znu2mv5bf0pm1wfar/oci-image@sha256:14ec52252771f8fa904afbdac497c80fc3234d518b1e0bced0c810d5748a7347"
Warning Failed 11m (x207 over 38h) kubelet Error: ErrImagePull
Warning Failed 5m51s (x497 over 37h) kubelet (combined from similar events): Failed to pull image "registry.jujucharms.com/charm/gm1axzm8pxqlan75l3a7znu2mv5bf0pm1wfar/oci-image@sha256:14ec52252771f8fa904afbdac497c80fc3234d518b1e0bced0c810d5748a7347": rpc error: code = FailedPrecondition desc = failed to pull and unpack image "registry.jujucharms.com/charm/gm1axzm8pxqlan75l3a7znu2mv5bf0pm1wfar/oci-image@sha256:14ec52252771f8fa904afbdac497c80fc3234d518b1e0bced0c810d5748a7347": failed commit on ref "layer-sha256:3bfc3875e0f70f1fb305c87b4bc4d886f1118ddfedfded03ef0cb1c394cb90f0": "layer-sha256:3bfc3875e0f70f1fb305c87b4bc4d886f1118ddfedfded03ef0cb1c394cb90f0" failed size validation: 20672632 != 53354193: failed precondition
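To make the two "failed size validation" events above easier to compare, here is a minimal Python sketch that pulls the two numbers out of each message and reports how much of the layer arrived. It assumes the first number is the byte count actually received and the second is the size declared for the layer (I have not confirmed this ordering). The function name summarize_size_error is made up for illustration.

```python
import re

# Matches the tail of the kubelet event text, e.g.
# "failed size validation: 19367144 != 32561845"
SIZE_ERROR = re.compile(r"failed size validation: (\d+) != (\d+)")


def summarize_size_error(message: str) -> dict:
    """Extract both byte counts and report how much of the layer arrived.

    Assumption: the first number is the bytes actually received and the
    second is the size declared for the layer.
    """
    match = SIZE_ERROR.search(message)
    if match is None:
        raise ValueError("no size validation error found in message")
    received, expected = (int(g) for g in match.groups())
    return {
        "received_bytes": received,
        "expected_bytes": expected,
        "missing_bytes": expected - received,
        "percent_received": round(100 * received / expected, 1),
    }


if __name__ == "__main__":
    # Message tails copied from the Events sections above.
    events = {
        "kfp-viz": "failed size validation: 19367144 != 32561845: failed precondition",
        "kfp-profile-controller": "failed size validation: 20672632 != 53354193: failed precondition",
    }
    for pod, message in events.items():
        print(pod, summarize_size_error(message))
```

If that reading of the numbers is right, both layers were only partially downloaded: roughly 59% of the kfp-viz layer and roughly 39% of the kfp-profile-controller layer.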
- Kubeflow version: 1.21
- kfctl version: kfctl v1.0.1-0-gf3edb9b
- Kubernetes platform: MicroK8s
- Kubernetes version: Client-GitVersion: "v1.24.0", Server-GitVersion: "v1.21.12-3+6937f71915b56b"
- OS: Linux 18.04