I'm building an app that spawns Jobs (batch/v1), and I need to update my Custom Resource status with the Job status.

I set up the controller with the following:

func (r *JobsManagedByRequestedBackupActionObserver) SetupWithManager(mgr ctrl.Manager) error {
    return ctrl.NewControllerManagedBy(mgr).
        For(&riotkitorgv1alpha1.RequestedBackupAction{}).
        Owns(&batchv1.Job{}).
        Owns(&batchv1.CronJob{}).
        WithEventFilter(predicate.Funcs{
            DeleteFunc: func(e event.DeleteEvent) bool {
                return false
            },
        }).
        Complete(r)
}

During Reconcile(ctx context.Context, req ctrl.Request), I fetch my RequestedBackupAction object (based on req) and then fetch the Jobs from the API using a dedicated tracking label.

list, err := kj.client.Jobs(namespace).List(ctx, metav1.ListOptions{LabelSelector: v1alpha1.LabelTrackingId + "=" + trackingId})

When I iterate over objects with:

for _, job := range list.Items {
    logrus.Errorf("[++++++++++++] JOB name=%s, failed=%v, active=%v, succeeded=%v", job.Name, job.Status.Failed, job.Status.Active, job.Status.Succeeded)
}

Then I get multiple entries like this:

time="2022-12-12T20:00:55Z" level=error msg="[++++++++++++] JOB name=app1-backup-vmqrp, failed=0, active=1, succeeded=0"

But I never get a final entry with failed=1, active=0, succeeded=0, even though the Job has actually finished. The point is that the controller is not being notified.

That's the final Job status:

  status:
    conditions:
    - lastProbeTime: "2022-12-12T20:00:56Z"
      lastTransitionTime: "2022-12-12T20:00:56Z"
      message: Job has reached the specified backoff limit
      reason: BackoffLimitExceeded
      status: "True"
      type: Failed
    failed: 1
    ready: 0
    startTime: "2022-12-12T20:00:50Z"
    uncountedTerminatedPods: {}
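
Given a status like the one above, a terminal state can be read from the conditions rather than from the counters alone. The following is a minimal sketch, not the code from the question: JobCondition and JobStatus are simplified stand-ins for the batchv1 types so the example runs without Kubernetes dependencies, and finished is a hypothetical helper name.

```go
package main

import "fmt"

// JobCondition is a simplified stand-in for batchv1.JobCondition
// (only the fields used in this sketch).
type JobCondition struct {
	Type   string // "Complete" or "Failed"
	Status string // "True", "False" or "Unknown"
	Reason string
}

// JobStatus is a simplified stand-in for batchv1.JobStatus.
type JobStatus struct {
	Conditions []JobCondition
	Active     int32
	Succeeded  int32
	Failed     int32
}

// finished reports whether the Job has reached a terminal state and,
// if so, which condition type ended it. Hypothetical helper.
func finished(s JobStatus) (bool, string) {
	for _, c := range s.Conditions {
		if (c.Type == "Complete" || c.Type == "Failed") && c.Status == "True" {
			return true, c.Type
		}
	}
	return false, ""
}

func main() {
	// Mirrors the log line from the question: one active pod, no conditions yet.
	running := JobStatus{Active: 1}
	// Mirrors the final status from the question.
	failed := JobStatus{
		Failed: 1,
		Conditions: []JobCondition{
			{Type: "Failed", Status: "True", Reason: "BackoffLimitExceeded"},
		},
	}
	for _, s := range []JobStatus{running, failed} {
		done, kind := finished(s)
		fmt.Printf("done=%v kind=%q\n", done, kind)
	}
}
```

With the real API types, the same loop would iterate over job.Status.Conditions and compare against the batchv1 and corev1 constants instead of string literals.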

What could be wrong?

2 Answers

The solution was really dead simple: when the object is not ready, requeue it, which for a Job means waiting until it has finished. Still, I don't understand why the controller is not notified about the state change from active=1 to active=0 and from failed=0 to failed=1.

Example:

if healthStatus.Running {
    return ctrl.Result{Requeue: true}, nil
}

Did you set the controller reference of your owned resources before creating?

The function SetControllerReference populates the metadata.ownerReferences field of the owned resource with a reference to the parent resource. Without that owner reference, a change to the owned resource cannot trigger a reconcile of the parent resource.

// instance is your custom resource
// job is the resource that is supposed to be owned by instance

err = ctrl.SetControllerReference(instance, job, r.Scheme)
if err != nil {
    return err
}

err = r.Create(ctx, job)
if err != nil {
    return err
}
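
To check whether an already created Job carries such a reference, one can look for an entry in metadata.ownerReferences with controller set to true. The sketch below assumes a simplified OwnerReference stand-in for the metav1 type, and the owner name "app1-backup" is made up for illustration. With the real types, metav1.GetControllerOf should serve the same purpose.

```go
package main

import "fmt"

// OwnerReference is a simplified stand-in for metav1.OwnerReference
// (only the fields used in this sketch).
type OwnerReference struct {
	Kind       string
	Name       string
	Controller *bool
}

// controllerOf returns the owner reference marked as controller, or nil
// if there is none. Hypothetical helper.
func controllerOf(refs []OwnerReference) *OwnerReference {
	for i := range refs {
		if refs[i].Controller != nil && *refs[i].Controller {
			return &refs[i]
		}
	}
	return nil
}

func main() {
	isController := true
	// "app1-backup" is an illustrative owner name, not taken from the question.
	refs := []OwnerReference{
		{Kind: "RequestedBackupAction", Name: "app1-backup", Controller: &isController},
	}
	if owner := controllerOf(refs); owner != nil {
		fmt.Printf("controlled by %s/%s\n", owner.Kind, owner.Name)
	} else {
		fmt.Println("no controller owner reference found")
	}
}
```

If the helper returns nil for a Job, events for that Job are unlikely to reach the parent's Reconcile through Owns(&batchv1.Job{}).
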
tuna