I am working on a parallel tree implementation. During profiling, I noticed high bad speculation on a movb instruction.
My data structure looks as follows:
struct data {
    enum annotation_type : std::uint8_t { none, core, region, tree_node }; // Specifies the value in the union
    access_pattern _access_pattern = writing;
    enum priority _priority = priority::normal;
    annotation_type _annotation_type = annotation_type::none;
    union {
        std::uint16_t _core;
        std::uint8_t _region;
        std::pair<node*, std::uint16_t> _tree_node = {nullptr,0};
    };
};
During traversal of the tree, I regularly update _annotation_type and the value of the union using the following function:
void annotate(const node* node, const std::uint16_t size) noexcept
{
    _annotation_type = annotation_type::tree_node;
    _tree_node = {node, size};
}
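For anyone who wants to reproduce this, the two snippets above can be combined into a rough self-contained sketch. The definitions of node, access_pattern and priority are not shown in my post, so the ones below are hypothetical placeholders. The pair here also holds a const node* and the parameter is renamed to n, which are assumptions made only so the sketch compiles on its own. The example() helper is likewise made up for illustration.

```cpp
#include <cstdint>
#include <utility>

// Hypothetical stand-ins for types that are not shown in the question.
struct node {};
enum class access_pattern : std::uint8_t { reading, writing };
enum class priority : std::uint8_t { low, normal, high };

struct data {
    // Specifies which member of the union is currently valid.
    enum annotation_type : std::uint8_t { none, core, region, tree_node };

    access_pattern _access_pattern = access_pattern::writing;
    priority _priority = priority::normal;
    annotation_type _annotation_type = annotation_type::none;
    union {
        std::uint16_t _core;
        std::uint8_t _region;
        // Assumption: const node* so that annotate() below compiles as written.
        std::pair<const node*, std::uint16_t> _tree_node = {nullptr, 0};
    };

    // Sets the tag first, then the pointer and size, like the original method.
    void annotate(const node* n, const std::uint16_t size) noexcept
    {
        _annotation_type = annotation_type::tree_node;
        _tree_node = {n, size};
    }
};

// Hypothetical usage mirroring the call annotate(node*, 512) from the question.
inline bool example()
{
    node n;
    data d;
    d.annotate(&n, 512);
    return d._annotation_type == data::tree_node
        && d._tree_node.first == &n
        && d._tree_node.second == 512;
}
```

This only reproduces the data layout and the three stores. It does not include the parallel traversal, so it will not necessarily show the same profile.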
Clang 10 compiles the annotate method to:
movb $0x3, 0x12(%r14) // Set _annotation_type
movq %rax, 0x14(%r14) // Set the pointer part of tree_node
movw $0x200, 0x1c(%r14) // Set the size part of tree_node, in this case it is called with annotate(node*, 512)
Using Intel VTune Profiler, I measured a CPI rate of 4.550 and a bad speculation of 53.0% for the instruction movb $0x3, 0x12(%r14), which sets the _annotation_type attribute to 0x3 (tree_node).
But how can a mov instruction show such high bad speculation? Isn't bad speculation caused by branch misprediction?