How are percpu pointers implemented in the Linux kernel?

Question

On a multiprocessor system, each core can have its own variables. I assumed they are different variables at different addresses, even though they live in the same process and share the same name.

But I am wondering: how does the kernel implement this? Does it set aside a region of memory to hold all the percpu data, and redirect each access to the right address by applying an offset or something similar?

score 27 · Accepted Answer · answered Jun 09 '13 at 14:48

Normal global variables are not per CPU. Automatic variables are on the stack, and different CPUs use different stacks, so naturally they get separate variables.

I guess you're referring to Linux's per-CPU variable infrastructure.
Most of the magic is here (asm-generic/percpu.h):

extern unsigned long __per_cpu_offset[NR_CPUS];

#define per_cpu_offset(x) (__per_cpu_offset[x])

/* Separate out the type, so (int[3], foo) works. */
#define DEFINE_PER_CPU(type, name) \
    __attribute__((__section__(".data.percpu"))) __typeof__(type) per_cpu__##name

/* var is in discarded region: offset to particular copy we want */
#define per_cpu(var, cpu) (*RELOC_HIDE(&per_cpu__##var, __per_cpu_offset[cpu]))
#define __get_cpu_var(var) per_cpu(var, smp_processor_id())

The macro RELOC_HIDE(ptr, offset) simply advances ptr by the given offset in bytes (regardless of the pointer type).
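
As a rough userspace illustration of that byte-wise advance (the helper names `advance_bytes` and `read_third_int` are made up for this sketch; the kernel's real RELOC_HIDE also hides the arithmetic from the compiler's optimizer, which this plain function does not attempt):

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper, not the kernel macro: move a pointer forward by
 * `offset` bytes, whatever type it points to, and return it untyped so
 * the caller casts it back. */
static void *advance_bytes(void *ptr, size_t offset)
{
    assert(ptr != NULL);
    return (char *)ptr + offset;
}

/* Example use: step from element 0 to element 2 of an int array.
 * The offset is counted in bytes, so it must be scaled by sizeof(int). */
static int read_third_int(int *array)
{
    return *(int *)advance_bytes(array, 2 * sizeof(int));
}
```

The cast to `char *` is what makes the offset count bytes rather than elements of the pointed-to type.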

What does it do?

When defining DEFINE_PER_CPU(int, x), an integer per_cpu__x is created in the special .data.percpu section.
When the kernel is loaded, this section is loaded multiple times, once per CPU (this part of the magic isn't in the code above).
The __per_cpu_offset array is filled with the distances between the copies. Supposing 1000 bytes of per-CPU data are used, __per_cpu_offset[n] would contain 1000*n.
The symbol per_cpu__x will be relocated, during load, to CPU 0's per_cpu__x.
__get_cpu_var(x), when running on CPU 3, will translate to *RELOC_HIDE(&per_cpu__x, __per_cpu_offset[3]). This starts with CPU 0's x, adds the offset between CPU 0's data and CPU 3's, and finally dereferences the resulting pointer.
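
The steps above can be mimicked in ordinary userspace C. This is only a sketch of the idea, not kernel code: the struct `percpu_template`, the array `copies`, and the functions `setup_per_cpu_areas_sketch` and `per_cpu_x` are hypothetical names, `NR_CPUS` is assumed to be 4, and a plain C array stands in for the replicated .data.percpu section.

```c
#include <assert.h>

#define NR_CPUS 4 /* assumed CPU count for this sketch */

/* Stand-in for everything placed in the per-CPU section. */
struct percpu_template {
    int x;
    long counter;
};

/* One copy per CPU; copies[0] carries the initial values and plays
 * the role of "CPU 0's copy" that the symbols refer to. */
static struct percpu_template copies[NR_CPUS] = { { .x = 7, .counter = 0 } };

/* Byte distance from CPU 0's copy to each CPU's copy. */
static unsigned long per_cpu_offset[NR_CPUS];

/* Replicate the initial data and record the distances between copies. */
static void setup_per_cpu_areas_sketch(void)
{
    for (int cpu = 0; cpu < NR_CPUS; cpu++) {
        copies[cpu] = copies[0];
        per_cpu_offset[cpu] =
            (unsigned long)((char *)&copies[cpu] - (char *)&copies[0]);
    }
}

/* Start at CPU 0's x, add the byte offset of the requested CPU. */
static int *per_cpu_x(int cpu)
{
    assert(cpu >= 0 && cpu < NR_CPUS);
    return (int *)((char *)&copies[0].x + per_cpu_offset[cpu]);
}
```

After calling `setup_per_cpu_areas_sketch()`, writing through `per_cpu_x(3)` changes only the fourth copy, and `per_cpu_offset[n]` equals `n * sizeof(struct percpu_template)`, matching the 1000*n pattern described above.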

ugoren

Thanks for your answer, but I still have some questions. I'm new to SMP, so no offence intended. First, I thought the same process should have the same stack; here is the thread definition in POSIX: "...and automatic variables, are accessible to all threads in the same process." Automatic variables are shared by threads. Different processors might have different stack segment registers, but the contents should be the same. Second, can we say that we can also access another CPU's variable if we want, just by rolling back the offset obtained by percpu? – dspjm Jun 17 '13 at 09:44
When two threads call function `foo`, which has an automatic variable `x`, there are two stacks and two instances of `x`. Each has a different address, and both threads can access both, if they have the address. With Linux's per-cpu variables, `per_cpu(var, cpu)` lets you access any cpu's variables. – ugoren Jun 18 '13 at 12:23
How does the .data.percpu section determine whether the percpu variable is declared on the stack or heap? – user31986 Dec 12 '13 at 02:51
During kernel loading, each CPU gets its own GDT. Every entry in this GDT represents a memory segment that can be accessed from the thread on this CPU. GDT entry 0 stores the memory segment of per-CPU memory. To access the per-CPU memory, Linux uses gs:{variable-offset}, like this: mov %gs:0x41(%rcx),%dl – Houcheng Mar 04 '14 at 09:32
Do dynamically allocated per-cpu variables fit into the picture of what @ugoren described above? Or do they use a completely different mechanism? – Alex D Mar 22 '15 at 19:12
@AlexD, as far as I know, this mechanism doesn't deal with dynamically allocated variables. – ugoren Mar 25 '15 at 10:08
I have been studying the code. It looks like things have changed a bit since this answer was written. `DEFINE_PER_CPU(int, x)` defines a symbol called `x`, not `per_cpu__x`. Rather than relocating `x` to CPU 0's copy when the kernel is loaded, `__per_cpu_offset[0]` holds the difference between the address which the linker assigned to `x` and where CPU 0's copy is actually stored. Also: rather than accessing `__per_cpu_offset` all the time, the x86 kernel stores that offset as the base of segment `fs`. Then dereferencing the pointer within segment `fs` goes directly to the current CPU's copy! – Alex D Mar 25 '15 at 18:02
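
For what it's worth, the newer scheme described in that last comment can also be imitated in userspace. This is a loose sketch under several assumptions: `percpu_template`, `setup_areas`, `switch_to_cpu`, `this_cpu_off` and `this_cpu_x` are hypothetical names, `NR_CPUS` is assumed to be 4, `malloc` stands in for the kernel's per-CPU area allocation, and an ordinary variable plays the role of the segment base. The integer-to-pointer arithmetic relies on the usual flat address space behaviour of common compilers.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define NR_CPUS 4 /* assumed CPU count for this sketch */

/* Stand-in for the per-CPU section as placed by the linker. After
 * setup nobody uses this copy; it only supplies initial values and
 * link-time addresses. */
static struct { int x; long y; } percpu_template = { .x = 5, .y = 0 };

/* Difference between each CPU's real area and the link-time address.
 * Unsigned wraparound makes "negative" differences work. */
static uintptr_t per_cpu_offset[NR_CPUS];

/* Plays the role of the segment base holding the current CPU's offset. */
static uintptr_t this_cpu_off;

/* Give every CPU, including CPU 0, its own copy (never freed here). */
static int setup_areas(void)
{
    for (int cpu = 0; cpu < NR_CPUS; cpu++) {
        void *area = malloc(sizeof(percpu_template));
        if (area == NULL)
            return -1;
        memcpy(area, &percpu_template, sizeof(percpu_template));
        per_cpu_offset[cpu] = (uintptr_t)area - (uintptr_t)&percpu_template;
    }
    return 0;
}

/* Pretend the code is now running on the given CPU. */
static void switch_to_cpu(int cpu)
{
    assert(cpu >= 0 && cpu < NR_CPUS);
    this_cpu_off = per_cpu_offset[cpu];
}

/* Link-time address of x plus the current CPU's offset. */
static int *this_cpu_x(void)
{
    return (int *)((uintptr_t)&percpu_template.x + this_cpu_off);
}
```

Here `per_cpu_offset[0]` is not zero, because even CPU 0's copy lives somewhere other than the address the linker assigned to `x`.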
