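This has nothing to do with C++ (or any other language) and everything to do with the underlying processor architecture/implementation. Division is generally several times slower than multiplication, and on some processor families closer to 10x slower, with the exact ratio depending on the data type and on whether you measure latency or throughput.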
That said, the latency of a divide instruction is rarely the main performance hotspot in a program these days (your mileage may vary, of course). The place where it makes most sense to replace division with multiplication by the reciprocal is when you're dividing a vector by a constant factor (e.g. vector normalization, scaling, matrix pivoting). This is also the place where pipeline depth will do the most to mask the div instruction latency, and where memory bandwidth (assuming large vectors) will likely become the throughput limitation.
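To make the vector case concrete, here is a minimal sketch of the two variants. The function names `scale_by_division` and `scale_by_reciprocal` are made up for illustration, and `float` elements are assumed. The reciprocal version pays for one divide up front and then does one multiply per element.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Straightforward version: one divide per element.
void scale_by_division(std::vector<float>& v, float divisor) {
    assert(divisor != 0.0f);
    for (std::size_t i = 0; i < v.size(); ++i) {
        v[i] /= divisor;
    }
}

// Reciprocal version: a single divide, then one multiply per element.
// The result can differ from the division version in the last bit,
// because 1/divisor is rounded before it is used.
void scale_by_reciprocal(std::vector<float>& v, float divisor) {
    assert(divisor != 0.0f);
    const float inv = 1.0f / divisor;
    for (std::size_t i = 0; i < v.size(); ++i) {
        v[i] *= inv;
    }
}
```

Both functions give identical results when the divisor is a power of two, since the reciprocal is then exact. For other divisors the outputs may differ by a rounding error, so whether the swap is acceptable depends on your accuracy requirements. Measure before assuming the second version is faster on large vectors.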
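In short, it's rarely worthwhile to engage in these types of micro-optimizations these days. Most likely the compiler's optimizer will recognize cases where it is safe to substitute multiplication for division anyway: integer division by a compile-time constant, for example, or floating-point division when the reciprocal is exactly representable or when relaxed floating-point flags such as -ffast-math are enabled.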