How to get physical and virtual address bits with C/C++ by CPUID command

Question

I'm trying to get the physical and virtual address sizes (in bits) in C by using the CPUID instruction on Windows. I can get the processor information this way, but I'm confused about how to get the address bits. It looks like I should use leaf 80000008, but when I do it this way, only 7-8 digits that change continuously are displayed. I want to learn how this instruction works and solve this problem.

#include <stdio.h>

void getcpuid(int T, int* val) {
    int reg_ax;
    int reg_bx;
    int reg_cx;
    int reg_dx;
    __asm {
        mov eax, T;
        cpuid;
        mov reg_ax, eax;
        mov reg_bx, ebx;
        mov reg_cx, ecx;
        mov reg_dx, edx;
    }
    *(val + 0) = reg_ax;
    *(val + 1) = reg_bx;
    *(val + 2) = reg_cx;
    *(val + 3) = reg_dx;
}

int main() {
    int val[5]; val[4] = 0;
    getcpuid(0x80000002, val);
    printf("%s\r\n", &val[0]);
    getcpuid(0x80000003, val);
    printf("%s\r\n", &val[0]);
    getcpuid(0x80000004, val);
    printf("%s\r\n", &val[0]);
    return 0;
}

When I run this code with EAX = 80000002, 80000003, 80000004, the Intel processor brand string is displayed. Then I put in 80000008 to get the physical and virtual address bits, but random, constantly changing numbers were displayed. I want to know how to use the CPUID instruction with 80000008 to get those address bits.

I'm a beginner in programming and operating systems. Please let me know what I have to do.

Is val a string? What did you expect to get and what did you actually get? — Surt, Oct 24 '20 at 13:22
I got the processor information and printed it via val. I called the getcpuid function 3 times with 0x80000002, 0x80000003, 0x80000004, and the processor name was displayed. — kkkkkk, Oct 24 '20 at 13:44
Windows has a [cpuid intrinsic](https://learn.microsoft.com/en-us/cpp/intrinsics/cpuid-cpuidex?view=vs-2019) that is probably a better option than writing your own. — Nate Eldredge, Oct 24 '20 at 15:15
I don't understand what "only 7-8 digits change continuously are displayed." means. Note that your current code is trying to print the resulting values as a string, which makes some sense for the processor ID, but none at all for the address sizes which are packed 8-bit binary integers. If you're using different code for the 0x80000008 case, then please post **that** code so we can see what you're actually talking about. — Nate Eldredge, Oct 24 '20 at 15:17
Thanks to your comments, I now roughly know what I did wrong and how to fix this code, but I'm still having a hard time with this problem. Specifically, are you saying that the getcpuid function I wrote and the way I query the address bits with 80000008 are right? Then is the mistake in how I represent the address bits, with an array like `int val[5]`? If so, how should I represent those address bits? I can't easily see how to express what I want this way. — kkkkkk, Oct 24 '20 at 16:40

Brendan · Accepted Answer · 2020-10-25T11:25:50.637

The inline assembly you're using may be right, but this depends on which compiler it is. I think it is right for Microsoft's MSVC (but I've never used it and can't be sure). For GCC (and Clang) you'd have to inform the compiler that you're modifying the contents of registers and memory (via a clobber list), and it would be more efficient to tell the compiler that you're outputting 4 values in 4 registers.
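
For the GCC/Clang case, the "4 values in 4 registers" idea might look like the sketch below. The function name `getcpuid_gcc` and the extra `subleaf` parameter are assumptions for illustration, not part of the question's code. Note also that MSVC only accepts `__asm` blocks when targeting 32-bit x86, so the question's version will not build as a 64-bit program.

```c
#include <stdint.h>

#if defined(__GNUC__) && (defined(__i386__) || defined(__x86_64__))
/* GCC/Clang extended asm. The output constraints "=a", "=b", "=c", "=d"
   bind val[0..3] to EAX, EBX, ECX, EDX, so the compiler already knows
   which registers CPUID overwrites and no separate clobber list is
   needed. ECX is passed as an input because some leaves read a
   sub-leaf number from it. */
static inline void getcpuid_gcc(uint32_t leaf, uint32_t subleaf, uint32_t val[4]) {
    __asm__ volatile ("cpuid"
                      : "=a"(val[0]), "=b"(val[1]), "=c"(val[2]), "=d"(val[3])
                      : "a"(leaf), "c"(subleaf));
}
#endif
```

A call such as `getcpuid_gcc(0x80000008U, 0, val);` would then fill `val` the same way the MSVC version does.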

The main problem is that you're trying to treat the output as a (null-terminated) string; and the data returned by CPUID is never a null-terminated string (even for "get vendor string" and "get brand name string", it's a whitespace-padded string with no zero terminator).

To fix that you could print the raw register values instead:

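#include <stdint.h>
#include <stdio.h>

// Runs CPUID for leaf T and stores EAX, EBX, ECX, EDX in val[0..3].
void getcpuid(uint32_t T, uint32_t* val) {
    uint32_t reg_ax;
    uint32_t reg_bx;
    uint32_t reg_cx;
    uint32_t reg_dx;
    __asm {
        mov eax, T;
        cpuid;
        mov reg_ax, eax;
        mov reg_bx, ebx;
        mov reg_cx, ecx;
        mov reg_dx, edx;
    }
    val[0] = reg_ax;
    val[1] = reg_bx;
    val[2] = reg_cx;
    val[3] = reg_dx;
}

int main() {
    uint32_t val[4];
    // Print all four registers of each leaf as raw hexadecimal words.
    getcpuid(0x80000002U, val);
    printf("0x%08X 0x%08X 0x%08X 0x%08X\r\n", val[0], val[1], val[2], val[3]);
    getcpuid(0x80000003U, val);
    printf("0x%08X 0x%08X 0x%08X 0x%08X\r\n", val[0], val[1], val[2], val[3]);
    getcpuid(0x80000004U, val);
    printf("0x%08X 0x%08X 0x%08X 0x%08X\r\n", val[0], val[1], val[2], val[3]);
    return 0;
}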
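
If the brand string is still wanted as text, one option is to copy the 48 bytes into a buffer and add the terminator yourself. The sketch below assumes the 12 register values have already been collected (for example with three `getcpuid` calls); the name `brand_string_from_regs` is hypothetical.

```c
#include <stdint.h>

/* regs holds the 12 register values in call order: EAX, EBX, ECX, EDX of
   leaf 0x80000002, then of 0x80000003, then of 0x80000004.
   out must have room for 48 characters plus the terminator. */
static void brand_string_from_regs(const uint32_t regs[12], char out[49]) {
    for (int i = 0; i < 12; i++) {
        for (int b = 0; b < 4; b++) {
            /* The lowest byte of each register is the earliest character. */
            out[i * 4 + b] = (char)((regs[i] >> (8 * b)) & 0xFFu);
        }
    }
    out[48] = '\0';   /* terminate explicitly rather than relying on the CPU's padding */
}
```

With `uint32_t regs[12]`, the three calls would write to `regs`, `regs + 4`, and `regs + 8`, after which `out` can safely be printed with `%s`.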

The next problem is extracting the virtual address size and physical address size values. These are 8-bit values packed into the lowest two bytes of eax (physical address size in bits 0-7, virtual address size in bits 8-15); so:

int main() {
    uint32_t val[5]; val[4] = 0;
    int physicalAddressSize;
    int virtualAddressSize;

    getcpuid(0x80000008U, val);
    physicalAddressSize = val[0] & 0xFF;
    virtualAddressSize= (val[0] >> 8) & 0xFF;

    printf("Virtual %d, physical %d\r\n", virtualAddressSize, physicalAddressSize);
    return 0;
}
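
The masking and shifting can also be pulled into a small helper that is easy to check by hand without running CPUID. This is only a sketch: the struct and function names are made up for illustration. For a CPU that reports "virtual 48, physical 39", the low 16 bits of eax would be 0x3027 (0x30 = 48, 0x27 = 39).

```c
#include <stdint.h>

typedef struct {
    unsigned physical_bits;   /* EAX bits 0-7  */
    unsigned virtual_bits;    /* EAX bits 8-15 */
} address_sizes;

/* Decodes EAX as returned by CPUID leaf 0x80000008.
   Bits above 15 are ignored here. */
static address_sizes decode_address_sizes(uint32_t eax) {
    address_sizes s;
    s.physical_bits = eax & 0xFFu;
    s.virtual_bits  = (eax >> 8) & 0xFFu;
    return s;
}
```

Printing the fields with `%u` rather than `%s` is what avoids the garbage output from the question.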

That should work on most recent CPUs; which means that it's still awful and broken on older CPUs.

To start fixing that you want to check that the CPU supports "CPUID leaf 0x80000008" before you assume it exists:

int main() {
    uint32_t val[5]; val[4] = 0;
    int physicalAddressSize;
    int virtualAddressSize;

    getcpuid(0x80000000U, val);
    if(val[0] < 0x80000008U) {
        physicalAddressSize = -1;
        virtualAddressSize = -1;
    } else {
        getcpuid(0x80000008U, val);
        physicalAddressSize = val[0] & 0xFF;
        virtualAddressSize= (val[0] >> 8) & 0xFF;
    }
    printf("Virtual %d, physical %d\r\n", virtualAddressSize, physicalAddressSize);
    return 0;
}

You can return correct results even when "CPUID leaf 0x80000008" doesn't exist. For all CPUs that don't support "CPUID leaf 0x80000008", the virtual address size is 32 bits, and the physical address size is either 36 bits (if PAE is supported) or 32 bits (if PAE is not supported). You can use CPUID to determine if the CPU supports PAE (bit 6 of edx from "CPUID leaf 0x00000001"), so it ends up a bit like this:

int main() {
    uint32_t val[5]; val[4] = 0;
    int physicalAddressSize;
    int virtualAddressSize;

    getcpuid(0x80000000U, val);
    if(val[0] < 0x80000008U) {
        getcpuid(0x00000000U, val);
        if(val[0] == 0) {
            physicalAddressSize = 32;          // "CPUID leaf 0x00000001" not supported
        } else {
            getcpuid(0x00000001U, val);
            if((val[3] & (1 << 6)) != 0) {
                physicalAddressSize = 36;      // PAE is supported
            } else {
                physicalAddressSize = 32;      // PAE not supported
            }
        }
        virtualAddressSize = 32;
    } else {
        getcpuid(0x80000008U, val);
        physicalAddressSize = val[0] & 0xFF;
        virtualAddressSize= (val[0] >> 8) & 0xFF;
    }
    printf("Virtual %d, physical %d\r\n", virtualAddressSize, physicalAddressSize);
    return 0;
}
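
The same decision logic can be separated from the CPUID calls so that it can be exercised with made-up register values. The sketch below assumes the caller has already read the four inputs; the function and parameter names are hypothetical. The PAE test is fully parenthesized because `!=` binds more tightly than `&` in C.

```c
#include <stdint.h>

#define CPUID_LEAF1_EDX_PAE (1u << 6)   /* PAE feature flag in EDX of leaf 1 */

/* max_ext_leaf:   EAX from leaf 0x80000000
   max_basic_leaf: EAX from leaf 0x00000000
   leaf1_edx:      EDX from leaf 0x00000001 (only used if that leaf exists)
   leaf8_eax:      EAX from leaf 0x80000008 (only used if that leaf exists) */
static void address_sizes_from_cpuid_values(uint32_t max_ext_leaf,
                                            uint32_t max_basic_leaf,
                                            uint32_t leaf1_edx,
                                            uint32_t leaf8_eax,
                                            int *physical_bits,
                                            int *virtual_bits) {
    if (max_ext_leaf >= 0x80000008u) {
        /* The CPU reports the sizes directly. */
        *physical_bits = (int)(leaf8_eax & 0xFFu);
        *virtual_bits  = (int)((leaf8_eax >> 8) & 0xFFu);
        return;
    }
    /* Fallback for CPUs without leaf 0x80000008. */
    *virtual_bits = 32;
    if (max_basic_leaf >= 1u && (leaf1_edx & CPUID_LEAF1_EDX_PAE) != 0) {
        *physical_bits = 36;   /* PAE is supported */
    } else {
        *physical_bits = 32;   /* PAE not supported, or leaf 1 missing */
    }
}
```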

The other problem is that sometimes CPUID is buggy; which means that you have to trawl through every single errata sheet for every CPU (from Intel, AMD, VIA, etc) to be sure the results from CPUID are actually correct. For example, there are 3 models of "Intel Pentium 4 Processor on 90 nm Process" where "CPUID leaf 0x80000008" is wrong and says "physical addresses are 40 bits" when they are actually 36 bits.

For all of these cases you need to implement work-arounds (e.g. get the CPU vendor/family/model/stepping information from CPUID and if it matches one of the 3 buggy models of Pentium 4, do an "if(physicalAddressSize == 40) physicalAddressSize = 36;" to fix the CPU's bug).
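
A work-around of that kind might be sketched as follows. The family/model decoding follows Intel's documented layout of eax from "CPUID leaf 0x00000001" (other vendors may combine the extended fields differently), and a real check would also compare the vendor string. The list of affected CPUs is a parameter because the answer does not name the three models; all names here are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

typedef struct {
    unsigned family;
    unsigned model;
    unsigned stepping;
} cpu_signature;

/* Decodes EAX from CPUID leaf 1 using Intel's convention. */
static cpu_signature decode_signature(uint32_t leaf1_eax) {
    unsigned base_family = (leaf1_eax >> 8) & 0xFu;
    unsigned base_model  = (leaf1_eax >> 4) & 0xFu;
    unsigned ext_family  = (leaf1_eax >> 20) & 0xFFu;
    unsigned ext_model   = (leaf1_eax >> 16) & 0xFu;
    cpu_signature s;
    s.stepping = leaf1_eax & 0xFu;
    /* The extended family is only added when the base family is 0xF. */
    s.family = (base_family == 0xFu) ? base_family + ext_family : base_family;
    /* The extended model is only used when the base family is 0x6 or 0xF. */
    s.model = (base_family == 0x6u || base_family == 0xFu)
                  ? ((ext_model << 4) | base_model)
                  : base_model;
    return s;
}

/* Returns 36 if the CPU is on the caller-supplied errata list and reported
   40 bits; otherwise returns the reported value unchanged. */
static int fix_physical_bits(int reported_bits, cpu_signature cpu,
                             const cpu_signature *affected, size_t count) {
    for (size_t i = 0; i < count; i++) {
        if (reported_bits == 40 &&
            affected[i].family == cpu.family &&
            affected[i].model == cpu.model &&
            affected[i].stepping == cpu.stepping) {
            return 36;
        }
    }
    return reported_bits;
}
```

The `affected` array would have to be filled in from the vendor's errata documents; nothing in this sketch supplies those values.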

Thank you so much for your answer. I find it very difficult to study these things these days, and I feel that doing it alone is much more difficult. I am really grateful for the help. It's only been one day since I signed up for Stack Overflow, but I'm enjoying learning. There are also parts that I still don't understand, and I'm sorry, but I'd like to ask you some more questions. Bugs depending on the specific brand and model of cpu and how to fix them seems a bit difficult for me right now. — kkkkkk, Oct 25 '20 at 10:40
Actually, it seems that will take a long time, so I will first ask about something I find really difficult and am curious about. I am currently working with MSVC (IDE) on Windows 8, and my CPU is a 22 nm Intel i5-4670 (3.40 GHz). I found out that I made a complete mistake in how I displayed the value obtained from the cpuid function (80000008). From the standpoint of studying programming and operating systems, I learned a very simple and innovative new way. But actually, I still have a question. — kkkkkk, Oct 25 '20 at 10:40
https://www.sandpile.org/x86/cpuid.htm On the page above, looking at the information in the paragraph , it says that if you put the value in EAX and execute the CPUID instruction, bits 0-7 hold the physical address bits and bits 8-15 hold the virtual address bits. For example, suppose I got the value 12345678; then I think it would show 56 as the virtual address size and 78 as the physical address size. — kkkkkk, Oct 25 '20 at 10:40
But in the second code you modified, you are using a value of 0x80000002, and when I compile it in this state, the result "virtual 67, physical 32" is displayed. But I don't think the virtual address size can be 67 bits. It would be good if this were just your typo; otherwise I think either my CPU is buggy (I feel the probability is low) or my getcpuid function has a bug (@_@). Even if I change the input value of the second modified code to 80000008, or print the value with the 3rd or 4th code, both virtual and physical address bits are output as 0. I'm waiting for your kind comment. — kkkkkk, Oct 25 '20 at 10:55
@kkkkkk: Sorry - that was my fault. I cut&pasted the `0x80000002` from the original, failed to change it to `0x80000008` in the second example (but did change it in the third example).:-) — Brendan, Oct 25 '20 at 11:21
@kkkkkk: Ok - I found the other bug. The address sizes are in `eax` (not `ecx` as I originally had it), so I was getting the `physicalAddressSize ` and `virtualAddressSize ` from the wrong variable (from `val[2]` when I should've got them from `val[0]`). Hopefully it'll work now :-) — Brendan, Oct 25 '20 at 11:27
The 3rd modified code works. I understand this modification, and I changed val[2] to val[0] in the main function. Then "virtual 48, physical 39" is displayed. But I don't think these numbers make sense. Please let me know your opinion. — kkkkkk, Oct 25 '20 at 11:39
@kkkkkk: "virtual: 48, physical: 39" sounds right to me. All older 64-bit CPUs will report "virtual: 48" (Ice Lake and newer Intel CPUs will/should report 57 bits due to a recent extension - see https://en.wikipedia.org/wiki/Intel_5-level_paging ). For physical address size, anything from 32 bits to 52 bits is possible. 39 bits is enough for 512 GiB (e.g. 256 GiB of RAM and 256 GiB of space for memory mapped devices), but Intel says that Core i5-4670 only supports 32 GiB of RAM so 39 bits is plenty. — Brendan, Oct 25 '20 at 12:18
Thanks for the comments. I got new, useful information! I was fixated on certain numbers because of my old and foolish knowledge. Thanks to your kind and clear answer, I will keep developing and studying these things until I die. Someday I will surely pay it forward to others. — kkkkkk, Oct 25 '20 at 12:34
