Discussion on author's idea #1
The most-promising idea I had was to (1) find a way to enumerate all of the .o files upon which foo-use depends, and then (2) iterate over each of those .o files, calling add_dependencies for each one.
This shouldn't work according to the documentation for add_dependencies, which states:
Makes a top-level <target> depend on other top-level targets to ensure that they build before <target> does.
I.e., you can't use it to make a target depend on files, only on other targets.
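To make the distinction concrete, here is a minimal sketch, assuming prof.data is produced by a custom command that runs foo-gen. The target name prof-data and the training invocation are hypothetical. A file has to be wrapped in a custom target before add_dependencies will accept it, and even then the result is only a build-order guarantee:

```cmake
# Assumes the executable targets foo-gen and foo-use already exist.
set(prof_data "${CMAKE_CURRENT_BINARY_DIR}/prof.data")

# Custom command that (hypothetically) produces prof.data by running foo-gen.
add_custom_command(
  OUTPUT "${prof_data}"
  COMMAND foo-gen    # placeholder for the real training/reformat steps
  DEPENDS foo-gen
  VERBATIM
)

# add_dependencies() only accepts targets, so wrap the file in one.
add_custom_target(prof-data DEPENDS "${prof_data}")

# Ordering only: prof-data is brought up to date before foo-use is built,
# but foo-use's .o files are not recompiled just because prof.data changed.
add_dependencies(foo-use prof-data)
```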
Discussion on author's idea #2
I also considered using set_source_files_properties to set the OBJECT_DEPENDS property on each of my .cpp files used by foo-use, adding prof.data to that property's list.
The problem with this (AFAICT) is that each of my .cpp files is used to create two different .o files: one for foo-gen and one for foo-use. I want the .o files that get linked into foo-use to have this compile-time dependency on prof.data; but the .o files that get linked into foo-gen must not have a compile-time dependency on prof.data.
And AFAIK, set_source_files_properties doesn't let me set the OBJECT_DEPENDS property to have one of two values, contingent on whether foo-gen or foo-use is the current target of interest.
In the comment section, you mentioned that you could solve this if OBJECT_DEPENDS supported generator expressions, but it doesn't. As a side note, there is an issue ticket tracking this on the CMake GitLab repo. You can give it a thumbs up and describe your use case there for their reference.
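The conflict can be sketched as follows, assuming both executables are defined in the same directory and share a source file (foo.cpp is a hypothetical stand-in for each shared .cpp file):

```cmake
# foo.cpp stands in for each of the shared sources (hypothetical file name).
set_source_files_properties(foo.cpp PROPERTIES
  OBJECT_DEPENDS "${CMAKE_CURRENT_BINARY_DIR}/prof.data"
)

# The property belongs to the source file, not to a target, so the object
# files compiled for both executables pick up the dependency on prof.data.
add_executable(foo-gen foo.cpp)
add_executable(foo-use foo.cpp)
```

With this setup, foo-gen's objects would also wait on prof.data, which is exactly the dependency the question needs to avoid.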
In the comments section you also mentioned a possible solution to this:
Potential other solution a) double project system where main user invoked project forwards settings to second pgo project compiling same settings again.
You can actually put this into the CMake project via ExternalProject so that it becomes part of the generated buildsystem: make the top-level project include itself as an external project. The external project can be passed a cache variable to configure it to be the -gen version, and the top-level can be the -use version.
Speaking from experience, this is a whole other rabbit hole of long CMake-documentation-reading and fiddling sessions if you have never manually invoked or done anything with ExternalProject before, so that answer might belong with a new question dedicated to it.
This can solve the problem of not having generator expressions in OBJECT_DEPENDS, but if you want the top-level project to be multi-config and some of its configurations are not for PGO, then you will be back to square one.
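A rough sketch of the self-including arrangement, under several assumptions: the cache variable FOO_PGO_MODE, the target name foo_gen, the source file main.cpp, and the GCC/Clang-style -fprofile-generate flag are all illustrative, and the steps that run the training and feed the profile into the outer build are left out.

```cmake
cmake_minimum_required(VERSION 3.15)
project(foo CXX)

# Hypothetical switch: "use" for the outer build, "gen" for the nested one.
set(FOO_PGO_MODE "use" CACHE STRING "PGO phase: gen or use")

add_executable(foo main.cpp)

if(FOO_PGO_MODE STREQUAL "gen")
  # Nested build: instrumented binary (GCC/Clang-style flag, an assumption).
  target_compile_options(foo PRIVATE -fprofile-generate)
  target_link_options(foo PRIVATE -fprofile-generate)
else()
  # Outer build: configure and build this same source tree a second time,
  # in its own binary directory, with the mode switched to "gen".
  include(ExternalProject)
  ExternalProject_Add(foo_gen
    SOURCE_DIR "${CMAKE_CURRENT_SOURCE_DIR}"
    BINARY_DIR "${CMAKE_CURRENT_BINARY_DIR}/pgo-gen"
    CMAKE_CACHE_ARGS "-DFOO_PGO_MODE:STRING=gen"
    INSTALL_COMMAND ""    # nothing to install for the nested build
  )
  # Build the instrumented project before the outer executable.
  add_dependencies(foo foo_gen)
endif()
```

Because the nested configure runs with FOO_PGO_MODE set to gen, it never reaches the ExternalProject_Add call, so the project does not recurse into itself indefinitely.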
Proposed Solution
Here's what I've found works to make sources re-compile when profile data changes:
- To the custom command which runs the training executable and produces and re-formats the training data, add another COMMAND which produces a C++ header file containing a timestamp in a comment.
- Include that header in all sources which you want to re-compile if the training has been re-run.
If you want to support non-PGO builds, wrap the timestamp header in a header which checks that it exists with __has_include and only includes it if it exists.
I'm pretty sure with this approach, CMake doesn't do the dependency checking of TUs on the profile data, and instead, it's the generated buildsystem's header-dependency tracking which does that work. The rationale for including a timestamp comment in the header file instead of just "touch"ing it to change the timestamp in the filesystem is that the generated buildsystem might detect changes by file contents instead of by the filesystem timestamp.
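The timestamp-header step could look roughly like this. It is a sketch, not the exact setup from my project: the directory pgo, the file names pgo_stamp.h and write_pgo_stamp.cmake, and the bare foo-gen training invocation are all hypothetical, and the real training and re-formatting commands would replace the placeholder.

```cmake
# Assumes targets foo-gen (training executable) and foo-use already exist.
set(pgo_dir "${CMAKE_CURRENT_BINARY_DIR}/pgo")            # hypothetical location
set(pgo_stamp_header "${pgo_dir}/pgo_stamp.h")            # hypothetical name
set(pgo_stamp_script "${pgo_dir}/write_pgo_stamp.cmake")  # hypothetical name

# Helper script, run at build time with `cmake -P`. The bracket argument keeps
# ${OUT} and ${now} unexpanded until the script itself runs.
file(WRITE "${pgo_stamp_script}" [=[
string(TIMESTAMP now "%Y-%m-%dT%H:%M:%SZ" UTC)
file(WRITE "${OUT}" "// profile data regenerated at ${now}\n")
]=])

add_custom_command(
  OUTPUT "${pgo_stamp_header}"
  COMMAND foo-gen    # placeholder: run the training and re-format its data here
  # Extra COMMAND: rewrite the header so that its contents change on every run.
  COMMAND "${CMAKE_COMMAND}" "-DOUT=${pgo_stamp_header}" -P "${pgo_stamp_script}"
  DEPENDS foo-gen
  VERBATIM
)

# Listing the generated header as a source attaches the custom command to
# foo-use; the include path lets foo-use's .cpp files find the header.
target_sources(foo-use PRIVATE "${pgo_stamp_header}")
target_include_directories(foo-use PRIVATE "${pgo_dir}")
```

Any source compiled into foo-use that contains `#include "pgo_stamp.h"` then gets rebuilt by the generated buildsystem's header-dependency tracking whenever the training step rewrites the header.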
All the shortcomings of the proposed solution
The painfully obvious weakness of this approach is that you need to add a header include to all the .cpp files that you want to check for re-compilation. There are several problems that can spawn from this (from least to most egregious):
1. You might not like it from an aesthetics standpoint.
2. It certainly opens up a hole for human error in forgetting to include the header for new .cpp files. I don't know how to fully solve that, but some compilers have a flag that force-includes a file in every source file, such as GCC's -include flag and MSVC's /FI flag. You can then just add this flag to a CMake target using target_compile_options(<target> PRIVATE "SHELL:-include <path>").
3. You might not be able to change some of the sources that you need to re-compile, such as sources from third-party static libraries that your library depends on. There may be workarounds if you're using ExternalProject by doing something with the patch step, but I don't know.
For my personal project, #1 and #2 are acceptable, and #3 happens to not be an issue. You can take a look at how I'm doing things there if you're interested.
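For the force-include mitigation mentioned in #2, a minimal sketch might look like the following. The header path is hypothetical and assumed to contain no spaces, and the non-MSVC branch assumes a GCC-compatible compiler driver.

```cmake
# Hypothetical location of the generated timestamp header.
set(pgo_stamp_header "${CMAKE_CURRENT_BINARY_DIR}/pgo/pgo_stamp.h")

if(MSVC)
  # MSVC: /FI names a file to force-include into every translation unit.
  target_compile_options(foo-use PRIVATE "/FI${pgo_stamp_header}")
else()
  # GCC-style drivers: the SHELL: prefix keeps "-include" and the path
  # together as a pair when CMake de-duplicates compile options.
  target_compile_options(foo-use PRIVATE "SHELL:-include ${pgo_stamp_header}")
endif()
```

Since the option is set on foo-use only, sources shared with foo-gen are compiled without the forced include for that target.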
Toward a standard PGO CMake module
See https://gitlab.kitware.com/cmake/cmake/-/issues/19273