
I've got a GitHub Actions workflow that automatically creates build artifacts and updates a single release with them every time I merge a PR into main (here's the repo).

I want to know if a new PR will cause a change in the build artifacts (specifically, there's just one CSV file that I care about). Sometimes these changes will be intentional, sometimes not, so I want something like a git diff between the CSV file before the PR and the CSV file after the PR.

I know I could set up a GitHub Action to:

  1. Check out the old version of the code.
  2. Run the code to generate the build artifacts.
  3. Save the files of interest to disk.
  4. Check out the proposed version of the code from the PR.
  5. Run the PR code to generate the build artifacts.
  6. git diff the version before the PR against the version after the PR.
  7. Format and write the git diff output as a comment on the PR, letting me know what changed so I can manually check that everything's OK.

But this seems like a really common problem, and I can't believe there isn't a simple tool or solution out there already. Maybe some GitHub Action where you give it two SHAs, a command to run, and a list of files to git diff?

To be clear, these are build artifacts, so they aren't tracked by git, and solutions like `git diff pullrequest main -- myfile.csv` won't work.

torek
beyarkay
  • I'm facing the same issue. A side thought: running the code on `main` (or whatever branch your PR points to) to generate the artifacts and then on the HEAD of the branch isn't DRY. So one approach could be to use `git notes` to keep track of the list of artifacts and their respective SHAs, ready for the next PR. – ebosi Jan 20 '23 at 20:37
  • TIL about git notes. But wrt the question, I ended up abandoning the attempt because I couldn't find an easy way around it and didn't want to sink a lot of time into the project. – beyarkay Jan 21 '23 at 09:16
  • Well, I had a bit of spare time (: – ebosi Jan 25 '23 at 10:44

1 Answer

Here is a solution that leverages git notes:

[Screenshot: the artefact comparison report posted as a comment on the PR]

(In a nutshell, git notes let you create, read, update and delete metadata attached to a commit without touching the commit itself, thus preserving history. Cf. § References below.)

Essentially, we want our workflow to:

  1. Build the artefacts
    We emulate this by running make build, which you should adapt to your own scenario. For the sake of the example, we also assume that the build/ directory contains all the generated artefacts and nothing else.
  2. “Remember” the artefacts and their content (a so-called “artefacts summary”)
    We use the sha512sum shell command to create a mapping of each artefact's content (represented by its SHA-512 sum) to its file name.
    We retrieve all artefacts via find build/ -type f, and then convert the mapping to a CSV with headers using sed 's/  /,/' | cat <(echo 'sha512,file_name') - (note the two spaces in the sed pattern, matching the separator that sha512sum prints between the hash and the file name).
  3. Attach the artefacts summary to the commit
    That's where we leverage git notes, which allow us to add metadata to the commit ex post, without modifying the history.

These steps should be executed for any commit on your main branch.
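
For readers who would rather build the summary outside the shell, here is a minimal Python sketch of step 2. It is only an illustration of the idea, not what the workflow below runs: the function name `artefacts_summary` is made up for this example, and unlike `find` it sorts the paths so that the output is stable between runs.

```python
import csv
import hashlib
import io
from pathlib import Path


def artefacts_summary(build_dir: str) -> str:
    """Return a 'sha512,file_name' CSV describing every file under build_dir.

    Hypothetical helper mirroring the find/sha512sum/sed pipeline.
    """
    root = Path(build_dir)
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["sha512", "file_name"])
    # Sort the paths so two runs over identical content give identical output.
    for path in sorted(p for p in root.rglob("*") if p.is_file()):
        digest = hashlib.sha512(path.read_bytes()).hexdigest()
        writer.writerow([digest, path.as_posix()])
    return buffer.getvalue()
```

Calling `artefacts_summary("build")` after the build step would give one CSV row per artefact, which can then be attached to the commit as a note.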

In case of a PR, you also want to repeat these steps on the branch's HEAD, plus:

  1. Retrieve the artefacts summary of your PR's target branch
    So you now have two artefacts summaries to compare: base (that of your main/master branch) and head (that of your PR's branch). In the example below, the base is hard-coded to main, but you could refine this by letting the workflow retrieve the target branch's name automatically.
  2. Compare both artefacts summaries
    I've created the artefactscomparison Python package for that purpose. (Note: it's very much tailored to my use case and desiderata.)
  3. Add the artefact comparison report to your PR
    Beebop, a bot will do that for you.

In the end, you should see something like the screenshot above.
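
To make the comparison step more concrete, here is a rough Python sketch of what comparing two artefacts summaries can look like. It is not the code of the artefactscomparison package: `load_summary` and `compare_summaries` are hypothetical names, and the only assumption is that both inputs use the `sha512,file_name` CSV layout described above.

```python
import csv
import io


def load_summary(text: str) -> dict[str, str]:
    """Map file_name -> sha512 from a 'sha512,file_name' CSV summary."""
    reader = csv.DictReader(io.StringIO(text))
    return {row["file_name"]: row["sha512"] for row in reader}


def compare_summaries(base_csv: str, head_csv: str) -> dict[str, list[str]]:
    """Classify artefacts as added, removed or modified between base and head."""
    base, head = load_summary(base_csv), load_summary(head_csv)
    return {
        "added": sorted(head.keys() - base.keys()),
        "removed": sorted(base.keys() - head.keys()),
        # Present on both sides, but with a different content hash.
        "modified": sorted(
            name for name in base.keys() & head.keys() if base[name] != head[name]
        ),
    }


# Toy summaries with shortened, made-up hashes.
base = "sha512,file_name\naaa,build/data.csv\nbbb,build/old.txt\n"
head = "sha512,file_name\nccc,build/data.csv\nddd,build/new.txt\n"
report = compare_summaries(base, head)
```

Because only hashes are compared, this tells you which artefacts changed, not how they changed; a line-level diff of a specific CSV would still need the files themselves.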

name: Artefacts Comparison

on:
  push:
    branches:
      - main

  pull_request:
    branches:
      - main

permissions: write-all

jobs:
  build_artefacts:
    runs-on: ubuntu-latest
    steps:
      - name: Checkout
        uses: actions/checkout@v3
        with:
          fetch-depth: 0
          token: ${{ github.token }}
      - name: Build artefacts
        run: make build
      - name: Generate artefacts summary
        id: artefacts-summary
        run: |
          echo "ARTEFACTS_SUMMARY<<EOF" >> $GITHUB_OUTPUT
          find build/ -type f -exec sha512sum {} \; | sed 's/  /,/' | cat <(echo 'sha512,file_name') - >> $GITHUB_OUTPUT
          echo "EOF" >> $GITHUB_OUTPUT
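      - name: Add the artefacts summary as a git note
        run: |
          # Quote the refspecs so the shell never tries to expand the globs.
          git fetch origin 'refs/notes/*:refs/notes/*'
          git config user.name "github-actions"
          git config user.email "bot@github.com"
          # -f overwrites an existing note, so re-running the job on the same commit does not fail.
          git notes add -f -m "${{ steps.artefacts-summary.outputs.ARTEFACTS_SUMMARY }}"
          git notes show
          git push origin 'refs/notes/*'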
  # In case of PR, add report of artefacts comparison
  compare_artefacts:
    runs-on: ubuntu-latest
    if: ${{ github.event_name == 'pull_request' }}
    steps:
      - name: Checkout
        uses: actions/checkout@v3
        with:
          fetch-depth: 0
          token: ${{ github.token }}
      - name: Pull artefacts summaries (i.e., git notes) from upstream
        run: |
          git fetch origin 'refs/notes/*:refs/notes/*'
      - name: Retrieve PR's head branch's artefacts summary
        id: artefact-summary-head
        run: |
          echo "ARTEFACTS_SUMMARY<<EOF" >> $GITHUB_OUTPUT
          git notes show >> $GITHUB_OUTPUT
          echo "EOF" >> $GITHUB_OUTPUT
      - name: Retrieve PR's target branch's artefacts summary
        id: artefact-summary-base
        run: |
          git checkout ${{ github.base_ref }}
          echo "ARTEFACTS_SUMMARY<<EOF" >> $GITHUB_OUTPUT
          git notes show >> $GITHUB_OUTPUT
          echo "EOF" >> $GITHUB_OUTPUT
      - name: Setup Python
        uses: actions/setup-python@v4
        with:
          python-version: "3.10"
      - name: Install artefactscomparison package
        run: pip install -U artefactscomparison
      - name: Generate artefact comparison report
        id: artefact-comparison-report
        run: |
          echo "${{ steps.artefact-summary-head.outputs.ARTEFACTS_SUMMARY }}" > head.csv
          echo "${{ steps.artefact-summary-base.outputs.ARTEFACTS_SUMMARY }}" > base.csv
          echo "ARTEFACTS_REPORT<<EOF" >> $GITHUB_OUTPUT
          artefacts_comparison -b base.csv -h head.csv >> $GITHUB_OUTPUT
          echo "EOF" >> $GITHUB_OUTPUT
      - name: Comment PR with artefact comparison report
        uses: thollander/actions-comment-pull-request@v2
        with:
          message: ${{ steps.artefact-comparison-report.outputs.ARTEFACTS_REPORT }}
          comment_tag: artefact_comparison_report
          mode: recreate
    needs: build_artefacts
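
A side note on the `NAME<<EOF ... EOF` lines written to `$GITHUB_OUTPUT` above: that is the GitHub Actions syntax for multi-line step outputs, and the delimiter must not appear on a line of its own inside the value. If you ever generate such outputs from a script, a randomly generated delimiter avoids that risk. Below is a minimal sketch of the idea; `set_multiline_output` and its `output_path` parameter are hypothetical names, not part of any GitHub tooling.

```python
import os
import uuid
from typing import Optional


def set_multiline_output(name: str, value: str, output_path: Optional[str] = None) -> None:
    """Append a multi-line step output in the NAME<<DELIMITER format.

    Hypothetical helper; output_path defaults to the file that GitHub
    Actions exposes through the GITHUB_OUTPUT environment variable.
    """
    path = output_path or os.environ["GITHUB_OUTPUT"]
    # A random delimiter is very unlikely to collide with a line of the value.
    delimiter = f"ghadelimiter_{uuid.uuid4().hex}"
    body = value if value.endswith("\n") else value + "\n"
    with open(path, "a", encoding="utf-8") as handle:
        handle.write(f"{name}<<{delimiter}\n{body}{delimiter}\n")
```

In a workflow step this would be called as `set_multiline_output("ARTEFACTS_SUMMARY", summary)`, after which later steps can read the value through `steps.<id>.outputs.ARTEFACTS_SUMMARY` as in the jobs above.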

References:

ebosi