While "fork"ing a process, why does the Linux kernel copy the contents of the kernel page table for every newly created process?

Question

The discussion below applies to 32-bit ARM Linux kernel.

I noticed that during fork, the Linux kernel copies the contents of the kernel page table (the master page table, i.e. swapper_pg_dir) into the page table of every newly created process.

Questions are:

Why bother doing that?
Why can't all processes share a single copy of the kernel page table (the higher 1 GB on 32-bit ARM Linux), instead of memcpy-ing the swapper page table for each newly created process?
Is it a waste of memory?

Related source code ("-->" stands for a function call):
do_fork --> copy_process --> copy_mm --> dup_mm --> mm_init --> mm_alloc_pgd --> pgd_alloc, which contains:

/*
 * Copy over the kernel and IO PGD entries
 */
init_pgd = pgd_offset_k(0);

memcpy(new_pgd + USER_PTRS_PER_PGD, init_pgd + USER_PTRS_PER_PGD,
       (PTRS_PER_PGD - USER_PTRS_PER_PGD) * sizeof(pgd_t));
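To put numbers on the copy above, here is a minimal user-space sketch of the same memcpy. It is not kernel code: `pgd_alloc_sim`, `init_pgd_sim`, `kernel_pgd_bytes` and `PAGE_OFFSET_SIM` are hypothetical names, and it assumes the classic non-LPAE 32-bit ARM layout (2048 first-level entries of 8 bytes each, 2 MB per entry) with a plain 3 GB/1 GB split. The real `TASK_SIZE` sits slightly below `PAGE_OFFSET`, so the real `USER_PTRS_PER_PGD` is a little smaller than the value used here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Simplified stand-ins for the ARM definitions (assumptions, see lead-in). */
typedef struct { uint32_t pmd[2]; } pgd_t;    /* one entry: 8 bytes, maps 2 MB */

#define PGDIR_SHIFT        21
#define PTRS_PER_PGD       2048
#define PAGE_OFFSET_SIM    0xC0000000u        /* assumed 3 GB/1 GB split */
#define USER_PTRS_PER_PGD  (PAGE_OFFSET_SIM >> PGDIR_SHIFT)   /* 1536 */

/* Stand-in for swapper_pg_dir, the master first-level table. */
pgd_t init_pgd_sim[PTRS_PER_PGD];

/* Number of bytes the memcpy transfers for the kernel part. */
size_t kernel_pgd_bytes(void)
{
    return (PTRS_PER_PGD - USER_PTRS_PER_PGD) * sizeof(pgd_t);
}

/* Allocate a zeroed first-level table and copy only the kernel entries. */
pgd_t *pgd_alloc_sim(void)
{
    pgd_t *new_pgd;

    assert(sizeof(pgd_t) == 8);               /* layout assumed by the sizes above */

    new_pgd = calloc(PTRS_PER_PGD, sizeof(pgd_t));
    if (!new_pgd)
        return NULL;

    /* Same shape as the kernel snippet: user entries stay empty. */
    memcpy(new_pgd + USER_PTRS_PER_PGD, init_pgd_sim + USER_PTRS_PER_PGD,
           kernel_pgd_bytes());
    return new_pgd;
}
```

Under these assumptions the whole first-level table is 16 KB and the copied kernel part is 4 KB per process. Only first-level entries are copied; an entry that points to a second-level table is copied as a pointer, so the second-level table itself stays shared.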

There is only one physical MMU; it is hardware. Each process needs a full Level 1 table (16K). For the kernel, most entries are sections/super-sections, which map 1MB each with no L2. Processes can share L2 entries. – artless noise, Dec 01 '14 at 22:54
Thanks for your comment. Do you mean that because there is only one physical MMU, if there is only a single copy of the kernel master page table (swapper page table), the MMU needs to be flushed every time there is a kernel/user mode switch? – CodingNow, Dec 02 '14 at 03:01
No, each process may have a separate contiguous 16k L1 table. When the table base is switched, all L1 TLBs must be flushed. However, I think Linux only updates the *real* L1 page table and does a TLB flush for each update; on ARM there are some fake page tables. Sorry, I didn't look at what swapper_pg_dir is: a real table or a pseudo-table. Linux has arch-dependent/independent code. – artless noise, Dec 02 '14 at 23:46
So my understanding is that if (1) all processes share the same single copy of the kernel master page table for the higher 1GB part (kernel land) and (2) all processes use separate page tables for the lower 3GB part (user land), then processes do NOT have a contiguous 16K L1 table, hence the page table needs to be switched every time there is a kernel-user switch. Would you please kindly confirm it? Thank you! – CodingNow, Dec 03 '14 at 02:52
Hi, from the source code, the kernel page table is not copied completely. Actually only the page global directory is copied, which means the process page table and the kernel page table point to the same pud, pmd and pte. – keniee van, Jul 29 '17 at 11:53

score 4 · Accepted Answer · edited May 23 '17 at 10:24

Each process has its own copy of the kernel part (the higher 1GB) of the page table so that the L1 page table does not have to be switched (i.e. TTBR does not have to be updated) on a transition between user and kernel land. Note that such transitions happen very frequently.

Why avoid updating TTBR? Details can be found in: What is the downside of updating ARM TTBR (Translation Table Base Register)?
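The trade-off can be sketched with a toy model that just counts reloads of a simulated table base register. This is only an illustration under stated assumptions, not how the kernel is implemented: `struct cpu_sim`, `set_ttbr`, both counting functions and the table base addresses are hypothetical, and the model ignores TLB maintenance entirely.

```c
#include <stdint.h>

/* Toy CPU state: a simulated translation table base and a reload counter. */
struct cpu_sim {
    uintptr_t ttbr;          /* simulated TTBR value */
    unsigned long writes;    /* number of times the base actually changed */
};

/* Reload the simulated TTBR only when the base changes. */
static void set_ttbr(struct cpu_sim *cpu, uintptr_t base)
{
    if (cpu->ttbr != base) {
        cpu->ttbr = base;
        cpu->writes++;
    }
}

/* Per-process table that also contains the kernel entries:
 * entering and leaving the kernel keeps the same table. */
unsigned long reloads_with_per_process_copy(unsigned int nsyscalls)
{
    const uintptr_t proc_table = 0x4000;      /* hypothetical address */
    struct cpu_sim cpu = { proc_table, 0 };
    unsigned int i;

    for (i = 0; i < nsyscalls; i++) {
        set_ttbr(&cpu, proc_table);           /* kernel entry: no change */
        set_ttbr(&cpu, proc_table);           /* return to user: no change */
    }
    return cpu.writes;
}

/* Hypothetical alternative: kernel mappings live only in one separate table,
 * so every kernel entry and exit has to switch tables. */
unsigned long reloads_with_separate_kernel_table(unsigned int nsyscalls)
{
    const uintptr_t user_table = 0x4000;      /* hypothetical address */
    const uintptr_t kernel_table = 0x8000;    /* hypothetical address */
    struct cpu_sim cpu = { user_table, 0 };
    unsigned int i;

    for (i = 0; i < nsyscalls; i++) {
        set_ttbr(&cpu, kernel_table);         /* kernel entry */
        set_ttbr(&cpu, user_table);           /* return to user */
    }
    return cpu.writes;
}
```

In this model the per-process copy needs no reload for a system call, while the separate-table design needs two per call.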

answered Dec 05 '14 at 03:11

CodingNow

score 1 · Answer 2 · answered Dec 01 '14 at 05:25

Sharing page tables means sharing memory space. In other words, it defeats the point of having an operating system. Each process has its own page tables. Page tables do not use much memory.

Mike Matera

Thanks for the answer. My point is: why copy the "content" of the master page table into every newly created process's page table? Why can't every process share the same kernel page table (the higher 1GB part)? – CodingNow Dec 01 '14 at 05:28
