The reason we're forced to provide definitions of inline function in header files (or at least, in some form that is visible to the implementation when inlining a function in a given compilation unit) is requirements of the C++ standard.
However, the standard does not go out of its way to prevent implementations (e.g. the toolchain or parts of it, such as the preprocessor, compiler proper, linker, etc) from doing things a little smarter.
Some particular implementations do things a little smarter, so can actually inline functions even in circumstances where they are not visible to the compiler. For example, in a basic "compile all the source files then link" toolchain, a smart linker may realise that a function is small and only called a few times, and elect to (in effect) inline it, even if the points where inlining occurs were not visible to the compiler (e.g. because the statements that called the functions were in separate compilation units, the function itself is in another compilation unit) so the compiler would not do inlining.
The thing is, the standard does not prevent an implementation from doing that. It simply states the minimum set of requirements for behaviour of ALL implementations.
Essentially, the requirement that the compiler have visibility of a function to be inlined is the minimum requirement from the standard. If a program is written in that way (e.g. all functions to be inlined are defined in their header file) then the standard guarantees that it will work with every (standard compliant) implementation.
But what does this mean for our smarter tool-chain? The smarter tool-chain must produce correct results from a program that is well-formed - including one that defines inlined functions in every compilation unit which uses those functions. Our toolchain is permitted to do things smarter (e.g. peeking between compilation units) but, if code is written in a way that REQUIRES such smarter behaviour (e.g. that a compiler peek between compilation units) that code may be rejected by another toolchain.
In the end, every C++ implementation (the toolchain, standard library, etc) is required to comply with requirements of the C++ standard. The reverse is not true - one implementation may do things smarter than the standard requires, but that doesn't generate a requirement that some other implementation do things in a compatible way.
Technically, inlining is not limited to being a function of the compiler. It may happen in the compiler or the linker. It may also happen at run time - for example "Just In Time" technology can, in effect, restructure executable code after it has been run a few times in order to enhance subsequent performance [this typically occurs in a virtual machine environment, which permits the benefits of such techniques while avoiding problems associated with self-modifying executables].