Given that:

  • There seems to be no easy way to get a list of "changed" files in Jenkins (see here and here)
  • There seems to be no fast way to get a list of files changed since label xxxx

How can I go about optimising our build so that when we run PMD it only runs against files that have been modified since the last green build?

Backing up a bit… our PMD takes 3–4 minutes to run against ~1.5 million lines of code, and if it finds a problem the report invariably runs out of memory before it completes. I'd love to trim a couple of minutes off of our build time and get a good report on failures. My original approach was that I'd:

  • get the list of changes from Jenkins
  • run PMD against a union of that list and the contents of pmd_failures.txt
  • if PMD fails, include a list of failing files in pmd_failures.txt
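
The union step in that plan is small. A minimal sketch, assuming the changed files and the previous failures are each stored one path per line; the function name pmd_targets and the file name changed.txt are made up for illustration, and only pmd_failures.txt comes from the plan itself:

```sh
#!/bin/sh
# Print the union of two path lists (one path per line), without duplicates.
#   $1 = list of changed files (hypothetical name: changed.txt)
#   $2 = list of files that failed PMD last time (pmd_failures.txt)
# The failures list may not exist yet (first run, or the last run was clean).
pmd_targets() {
    if [ -f "$2" ]; then
        LC_ALL=C sort -u "$1" "$2"
    else
        LC_ALL=C sort -u "$1"
    fi
}
```

Called as pmd_targets changed.txt pmd_failures.txt, it prints the list of files to hand to PMD.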

More complicated than I'd like, but worth having a build that is faster but still reliable.

Once it was clear that Jenkins was not going to easily give me what I wanted, I realised there was another possible approach. We label every green build, so I could simply get the list of files changed since the label and do away with pmd_failures.txt entirely.

No dice. The idea of getting a list of files changed since label xxxx from Perforce seems to have never been streamlined from:


    $ p4 files //path/to/branch/...@label > label.out
    $ p4 files //path/to/branch/...@now > now.out
    $ diff label.out now.out

Annoying, but more importantly even slower for our many thousands of files than simply running PMD.

So now I'm looking into running PMD in parallel with other build stuff, which is still wasted time and resources and makes our build more complex. It seems daft to me that I can't easily get a list of changed files from Jenkins or from Perforce. Has anyone else found a reasonable workaround for these problems?


3 Answers

I think I've found the answer, and I'll mark my answer as correct if it works.

It's a bit more complex than I'd like, but I think it's worth it for the 3-4 minutes saved (and the potential memory issues avoided).

  1. At the end of a good build, save the good changelist number in a Perforce counter (as a post-build task). Looks like this:
    $ p4 counter last_green_trunk_cl %P4_CHANGELIST%
    
  2. When running PMD, read the counter into the property last.green.cl and get the list of files from:
    $ p4 files //path/to/my/branch/...@${last.green.cl},now
    //path/to/my/branch/myfile.txt#123 - edit change 123456 (text)
    //path/to/my/branch/myotherfile.txt#123 - add change 123457 (text)
    etc...
    (have to parse the output)
    
  3. Run PMD against those files.

That way we don't need the pmd_failures.txt and we only run PMD against files that have changed since the last green build.
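
The parsing in step 2 can be sketched as a small shell filter. This assumes the p4 files output looks like the sample lines shown in that step; the function name changed_depot_paths is hypothetical, and skipping deleted files is an added assumption (PMD cannot analyse a file that no longer exists), not something the steps spell out.

```sh
#!/bin/sh
# Read `p4 files` output on stdin and print one depot path per line,
# e.g. "//path/to/my/branch/myfile.txt#123 - edit change 123456 (text)"
# becomes "//path/to/my/branch/myfile.txt".
changed_depot_paths() {
    # Drop revisions whose action means the file is gone at head.
    grep -v -E '#[0-9]+ - (delete|move/delete|purge|archive) change [0-9]+ ' |
    # Strip the "#rev - action change N (type)" suffix; lines that do not
    # match the expected shape are not printed at all.
    sed -n -E 's/#[0-9]+ - [a-z/]+ change [0-9]+ \(.*\)$//p'
}

# Example use (needs a configured p4 client, so it is left commented out):
#   last_green_cl=$(p4 counter last_green_trunk_cl)
#   p4 files "//path/to/my/branch/...@${last_green_cl},now" | changed_depot_paths
```

The result is a list of depot paths, so it still has to be mapped to workspace paths before PMD can read the files.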

[EDIT: changed it to use p4 counter, which is way faster than checking in a file. Also, this was very successful so I will mark it as answered]

I'm not 100% sure since I've never used Perforce with Jenkins, but I believe Perforce passes the changelist number through the environment variable $P4_CHANGELIST. With that, you can run p4 filelog -c $P4_CHANGELIST, which should give you the files from that particular changelist. From there, it shouldn't be hard to script something up that feeds just the changed files (plus the old failures) into PMD.

I haven't used Perforce in a long time, but I believe the -Ztag parameter makes it easier to parse P4 output for the various scripting languages.
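
As a rough illustration of why tagged output is easier to script against: with tagged output (the -ztag global option in Perforce's documentation), p4 files prints one "... key value" line per field instead of a single formatted line. A sketch that pulls out the depot paths, assuming that field layout (depotFile first, action later in each record) and using a hypothetical function name:

```sh
#!/bin/sh
# Read tagged `p4 files` output on stdin and print the depot path of every
# record whose action is not a delete. Assumed input shape per record:
#   ... depotFile //path/to/branch/a.txt
#   ... rev 3
#   ... change 123456
#   ... action edit
#   ... type text
tagged_live_paths() {
    awk '
        # Remember the path of the record currently being read.
        /^\.\.\. depotFile / { sub(/^\.\.\. depotFile /, ""); path = $0; next }
        # The action arrives later in the same record; print the path
        # unless the action is "delete" or "move/delete".
        /^\.\.\. action /    { if ($3 !~ /delete$/) print path }
    '
}
```

Because each field sits on its own line, there is no need to guess where the path ends and the revision begins.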

David W.

Have you thought about using automatic labels? They're basically just an alias for a changelist number, so it's easier to get the set of files that differ between two automatic labels.

randy-wandisco