30

I'd like my program to read the cache line size of the CPU it's running on in C++.

I know that this can't be done portably, so I will need a solution for Linux and another for Windows (Solutions for other systems could be useful to others, so post them if you know them).

For Linux I could read the content of /proc/cpuinfo and parse the line beginning with cache_alignment. Maybe there is a better way involving a call to an API.

For Windows I simply have no idea.

Peter Cordes
Mathieu Pagé

8 Answers

21

On Win32, GetLogicalProcessorInformation will give you back a SYSTEM_LOGICAL_PROCESSOR_INFORMATION which contains a CACHE_DESCRIPTOR, which has the information you need.

Roger Lipscombe
Nick
7

On Linux try the proccpuinfo library, an architecture independent C API for reading /proc/cpuinfo

PiedPiper
5

Looks like at least SCO unix (http://uw714doc.sco.com/en/man/html.3C/sysconf.3C.html) has _SC_CACHE_LINE for sysconf. Perhaps other platforms have something similar?
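On Linux with glibc, a similar sysconf name exists as an extension: _SC_LEVEL1_DCACHE_LINESIZE. The sketch below tries it first and then falls back to the sysfs file coherency_line_size. The function name cache_line_size_linux is made up for illustration, and choosing cpu0/index0 as the cache to inspect is an assumption; a result of 0 means the size could not be determined (this can happen in some virtual machines and containers).

```cpp
#include <unistd.h>   // sysconf
#include <fstream>

// Hypothetical helper: returns the L1 data cache line size in bytes,
// or 0 if it cannot be determined on this system.
long cache_line_size_linux()
{
#ifdef _SC_LEVEL1_DCACHE_LINESIZE
    // glibc extension; may report 0 or -1 when no cache info is available.
    long size = sysconf(_SC_LEVEL1_DCACHE_LINESIZE);
    if (size > 0)
        return size;
#endif
    // Fallback (assumption: cpu0/index0 is representative of the whole machine).
    std::ifstream file("/sys/devices/system/cpu/cpu0/cache/index0/coherency_line_size");
    long from_sysfs = 0;
    if (file >> from_sysfs && from_sysfs > 0)
        return from_sysfs;
    return 0;
}
```

Callers should treat 0 as "unknown" and pick a conservative default themselves rather than assume a particular size.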

Evan Teran
5

For x86, the CPUID instruction. A quick google search reveals some libraries for win32 and c++. I have used CPUID via inline assembler as well.
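As a rough sketch of the CPUID route with GCC or Clang, the <cpuid.h> helper __get_cpuid can query leaf 1, where EBX bits 15:8 hold the CLFLUSH line size in units of 8 bytes (valid when the CLFSH feature bit, EDX bit 19, is set). The function name cache_line_size_cpuid is made up for illustration, and treating the CLFLUSH line size as the cache line size is an assumption that holds on mainstream x86 CPUs. On other architectures or compilers the sketch simply returns 0.

```cpp
#if defined(__x86_64__) || defined(__i386__)
#include <cpuid.h>    // GCC/Clang wrapper for the CPUID instruction
#endif

// Hypothetical helper: returns the CLFLUSH line size in bytes as reported
// by CPUID leaf 1, or 0 when not on x86 or when the value is unavailable.
unsigned cache_line_size_cpuid()
{
#if defined(__x86_64__) || defined(__i386__)
    unsigned eax = 0, ebx = 0, ecx = 0, edx = 0;
    if (__get_cpuid(1, &eax, &ebx, &ecx, &edx))
    {
        // EDX bit 19 (CLFSH) says whether the CLFLUSH line size field is valid.
        if (edx & (1u << 19))
        {
            // EBX bits 15:8 hold the line size in units of 8 bytes.
            return ((ebx >> 8) & 0xFFu) * 8u;
        }
    }
#endif
    return 0;
}
```

With MSVC the equivalent intrinsic is __cpuid from <intrin.h>; it is not shown here.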

Some more info:

robottobor
4

On Windows

#include <Windows.h>
#include <cstdio>    // getchar
#include <iostream>

using std::cout; using std::endl;

int main()
{
    SYSTEM_INFO systemInfo;
    GetSystemInfo(&systemInfo);
    cout << "Page Size Is: " << systemInfo.dwPageSize;
    getchar();
}

On Linux

http://linux.die.net/man/2/getpagesize

Researcher
  • After coming back to this, I don't believe I answered your question, which was about the cache line size rather than the memory page size, correct? https://en.wikipedia.org/wiki/Page_(computer_memory) I was googling for a page size snippet (working on a project involving memory access) and came here; the dangers of skimming. Please untick my answer, but it is probably worth leaving here for future reference. – Researcher Sep 15 '16 at 17:33
  • Indeed, the question was mistitled with "cache page size". I fixed it. – Peter Cordes Feb 08 '23 at 19:11
3

Here is sample code for those who wonder how to use the function from the accepted answer:

#include <new>
#include <iostream>
#include <Windows.h>


void ShowCacheSize()
{
    using CPUInfo = SYSTEM_LOGICAL_PROCESSOR_INFORMATION;
    DWORD len = 0;
    CPUInfo* buffer = nullptr;

    // Determine required length of a buffer
    if ((GetLogicalProcessorInformation(buffer, &len) == FALSE) && (GetLastError() == ERROR_INSUFFICIENT_BUFFER))
    {
        // Allocate the buffer; len is in bytes, so convert it to an element count
        buffer = new (std::nothrow) CPUInfo[len / sizeof(CPUInfo)]{ };

        if (buffer == nullptr)
        {
            std::cout << "Buffer allocation of " << len << " bytes failed" << std::endl;
        }
        else if (GetLogicalProcessorInformation(buffer, &len) != FALSE)
        {
            const DWORD count = len / sizeof(CPUInfo);
            for (DWORD i = 0; i < count; ++i)
            {
                // This will be true for multiple returned caches, we need just one
                if (buffer[i].Relationship == RelationCache)
                {
                    std::cout << "Cache line size is: " << buffer[i].Cache.LineSize << " bytes" << std::endl;
                    break;
                }
            }
        }
        else
        {
            std::cout << "ERROR: " << GetLastError() << std::endl;
        }

        delete[] buffer;
    }
}
metablaster
0

I think you need NtQuerySystemInformation from ntdll.dll.

rami
0

If supported by your implementation, C++17 std::hardware_destructive_interference_size would give you an upper bound (and ..._constructive_... a lower bound), taking into account stuff like hardware prefetch of pairs of lines.

But those are compile-time constants, so can't be correct on all microarchitectures for ISAs which allow different line sizes. (e.g. older x86 CPUs like Pentium III had 32-byte lines, but all later x86 CPUs have used 64-byte lines, including all x86-64. It's theoretically possible that some future microarchitecture will use 128-byte lines, but multi-threaded binaries tuned for 64-byte lines are widespread so that's perhaps unlikely for x86.)

For this reason, some current implementations choose not to implement that C++ feature at all. GCC does implement it, clang doesn't (Godbolt). It becomes part of the ABI when code uses it in struct layouts, so it's not something compilers can change in future to match future CPUs for the same target.


GCC defines both constructive and destructive as 64 on x86-64, neglecting the destructive interference that adjacent-line prefetch can cause, e.g. on Intel Sandybridge-family. It's not nearly as disastrous as false sharing within a cache line in a high-contention case, so you might choose to only use 64-byte alignment to separate objects that different threads will be accessing independently.
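A minimal sketch of using the constant where it is available: check the feature-test macro __cpp_lib_hardware_interference_size and fall back to a hard-coded value otherwise. The names kSeparation, PaddedCounter and PerThreadCounters are made up for illustration, and the fallback of 64 bytes is an assumption that matches current x86-64 CPUs but not necessarily other targets.

```cpp
#include <cstddef>
#include <new>   // std::hardware_destructive_interference_size, where implemented

#ifdef __cpp_lib_hardware_interference_size
// Compile-time value chosen by the implementation for this target.
constexpr std::size_t kSeparation = std::hardware_destructive_interference_size;
#else
// Assumption: 64-byte lines, used only when the library lacks the constant.
constexpr std::size_t kSeparation = 64;
#endif

// Each counter is aligned (and therefore padded) to kSeparation bytes,
// so two counters never share a cache line of that size or smaller.
struct alignas(kSeparation) PaddedCounter
{
    long value = 0;
};

// Two counters intended to be updated by different threads.
struct PerThreadCounters
{
    PaddedCounter a;
    PaddedCounter b;
};
```

Because kSeparation feeds into the struct layout, it becomes part of the ABI of any interface that exposes PerThreadCounters, which is the concern described above.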

Peter Cordes