Python recursive tree example

Question

I'm analying a bit of code in one of the Jenkins builds which uses recursion to get jenkins' downstream jobs' urls

def get_all_downstream_jobs_urls(ds_jobs: set = None):
    global JENKINS_JOBS

    if not ds_jobs: 
         ds_jobs = set(); ds_jobs.update(extract_ds_job_url(get_ds_jobs(BASE_JOB_URL)))

    temp = ds_jobs 

    for _ in ds_jobs.copy():
        result = extract_ds_job_url(get_ds_jobs(_)) # <--- jenkins rest api call
        if result: temp.update(result); JENKINS_JOBS.update(temp);
        else: return temp

    return get_all_downstream_jobs_urls(temp)

This works OK for a project which has downstream jobs although it's making too many calls to Jenkins rest api, but if a project doesn't have downstream jobs it is stuck in recursion. Could you help me figure out where the issue is?

I don't see any recursion here, unless `extract_ds_job_url()` or `get_ds_jobs()` make recursive calls. There is too much missing here to be able to help at all. — Martijn Pieters, Jul 13 '18 at 09:38
Also, `ds_jobs = set(); ds_jobs.update(...)` should just be `ds_jobs = set(...)`. The `global JENKINS_JOBS` line is redundant (nothing ever rebinds the name in the function). Don't use `_` for a variable name if you then actually use the value, `_` signals you intent to ignore the value. If it is a URL, name the variable `url`. `temp` is not a great name either, and you'd use `temp = ds_jobs.copy()` rather than create the copy for the loop. — Martijn Pieters, Jul 13 '18 at 09:42

Martijn Pieters · Accepted Answer · 2018-07-13T11:53:23.300

If extract_ds_job_url(get_ds_jobs(BASE_JOB_URL)) returns an empty set, you are forever calling get_all_downstream_jobs_urls(temp). That's because the for loop is not going to do anything.

The test at the top should check for None instead:

if ds_jobs is None:

and a separate test for ds_jobs being empty should end the recursion:

if not ds_jobs:
    # no downstream jobs to process
    return set()

I can't vouch for the rest of the logic, but there are certainly many style errors in the code. I'd refactor it to at least get rid of some of those errors:

JENKINS_JOBS is never rebound, so global JENKINS_JOBS is redundant and confusing and should be removed.
It's not clear why the function is updating a global and is returning the result set. It should do one, or the other, not both.
_ is, by convention, a throw-away variable. It signals that the value is not going to be used. Yet here the code does use it. It should be renamed job_url instead.
You really never should use ; in production code. Put the code on separate lines.
ds_jobs = set() then ds_jobs.update(...) is a way too verbose spelling of ds_jobs = set(...).
temp is not a good variable name, updated might be a better name. It should be made a copy when assigned, so updated = set(ds_jobs), and the .copy() call can be removed from the for loop.
the return when the first job URL doesn't have downstream URLs is probably not what you want either.
If you really want a tree of downstream URLs, the recursive call should not try to pass in all job urls collected so far! It's just as likely to call the jenkins API again and again for a job URL that was already checked.

The following code removes the recursion by using a stack instead, and is guaranteed to call the Jenkins API for each job URL just once:

def get_all_downstream_jobs_urls():
    ds_jobs = set()
    stack = [extract_ds_job_url(get_ds_jobs(BASE_JOB_URL))]
    while stack:
        job_url = stack.pop()
        if job_url in ds_jobs:
            # already seen before, skip
            continue
        ds_jobs.add(job_url)
        # add downstream jobs to the stack for further processing
        stack.extend(extract_ds_job_url(get_ds_jobs(job_url)))
    return ds_jobs

Last but not least, I strongly suspect that using a third-party library like the jenkinsapi package would make this all even simpler; the Jenkins API probably lets you query this information in just one call but a library probably does such a call for you and give you readily-parsed Python objects for the information.

Great explanation and alternative code sample. thanks – testing_kate Jul 13 '18 at 11:39 — testing_kate, Jul 13 '18 at 11:39

Python recursive tree example

1 Answers1