
I have been looking into using Profile Guided Optimisation in Visual C++ 2013. I'm happy with executing the training set using different scenarios as a manual step, but would like the final optimised build and link to work on our CI build server.

With that in mind, where is the best place for me to store the PGO profile databases? Storing them into version control (Git in our case) is the most convenient place, but I'm aware that they are binary files in the order of tens, potentially even hundreds, of megabytes and these won't necessarily store well in a source control system.

Alternatively, is there a better solution or best practices for integrating PGO into our automated builds?

Gnat

2 Answers


You have several options here; pick one after weighing the trade-offs:

  • Do you use PGO for the whole code base, or only for certain hotspots or modules? How often do you refresh the databases?
  • How serious a problem would it be to ship a release with stale or mismatched PGO databases?
  • How big is your project, and how big could it become in the worst case?
  • Is it acceptable to store PGO databases in source control in that worst case?

Your answers to these questions will point you towards an approach.

If you don't refresh the PGO databases very often and the file sizes won't bloat your repository, you can store them in version control.

If you regenerate them for each commit, you can put them in a separate repository and record the main repository's commit ID (the commit from which the PGO profile databases were created) in the commit message.

Or, if you don't regenerate them very often and it's acceptable to rebuild the databases when you return to a specific commit, you may not need to store them at all; just keep them on the CI build machine.

Or you can combine these approaches :)
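Whichever storage option you choose, the CI-side build follows the same shape. A minimal sketch of the MSVC PGO cycle as a batch script; the `app` target name, source file, and scenario flags are placeholders, not from the question:

```bat
REM Sketch of the MSVC profile-guided optimisation cycle (VS2013 toolchain).

REM 1. Instrumented build: /GL enables whole-program optimisation,
REM    /LTCG:PGINSTRUMENT produces an instrumented binary plus app.pgd.
cl /GL app.cpp /Fe:app.exe /link /LTCG:PGINSTRUMENT

REM 2. Training runs (the manual step in the question). Each run writes
REM    a .pgc count file next to the .pgd.
app.exe --scenario typical-workload
app.exe --scenario heavy-workload

REM 3. Optimised link: merges the .pgc counts into the .pgd and applies
REM    the profile. This is the only step the CI server must repeat if the
REM    .pgd/.pgc files are retrieved from storage.
link app.obj /LTCG:PGOPTIMIZE /PGD:app.pgd /OUT:app.exe
```

If sources have changed slightly since the profiles were collected, `/LTCG:PGUPDATE` can be used instead of `/LTCG:PGOPTIMIZE` to tolerate the drift.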

qwerty

Our solution was to store the PGO files using Git LFS.

The advantages of this approach:

  • PGO databases live in the repository alongside the code version they are associated with
  • The databases are fully and seamlessly available to the development, build and test machines without configuring external file storage
  • The large binary files aren't stored in the normal Git repository so won't bloat it or make merges tedious.

The only mild complication of this approach, which didn't significantly affect us, is that Git LFS must be installed and supported on every machine that touches the repository, from development workstations to CI servers.
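For reference, the Git LFS configuration this requires is small. A sketch assuming the PGO files use the standard `.pgd`/`.pgc` extensions; the `profiles/app.pgd` path is a placeholder:

```shell
# One-time setup per clone (and on each CI agent)
git lfs install

# Track the PGO database and training-count files via LFS
git lfs track "*.pgd" "*.pgc"
git add .gitattributes

# From here on, PGO files commit normally but are stored as LFS objects,
# so the main repository only holds small pointer files
git add profiles/app.pgd
git commit -m "Update PGO profile database"
```

On checkout, LFS fetches the real files transparently, so build scripts can reference the `.pgd` path exactly as if it were committed directly.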

Gnat