
In a well-known article there is the following statement about the Linux disk cache:

there's absolutely no reason to disable it!


Also:

A healthy Linux system with more than enough memory will, after running for a while, show the following expected and harmless behavior:

  • free memory is close to 0

  • used memory is close to total

  • available memory (or "free + buffers/cache") has enough room (let's say, 20%+ of total)

  • swap used does not change

These conditions are met in my case, yet there is a problem. I have production kernel-mode networking code which has to allocate memory in "atomic" context (kmalloc() with the GFP_ATOMIC flag set). So, as expected, under high load while "free memory is close to 0" my code cannot allocate memory, and this eventually turns into a denial of service.
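To make the failure mode concrete, here is a minimal sketch of the kind of allocation path involved (a hypothetical receive-path fragment; the struct and function names are made up for the example, not taken from the actual code):

```c
#include <linux/slab.h>
#include <linux/gfp.h>

/* Hypothetical per-packet context allocated on the receive path. */
struct pkt_ctx {
    void *payload;
    size_t len;
};

static struct pkt_ctx *alloc_pkt_ctx(size_t payload_len)
{
    struct pkt_ctx *ctx;

    /* We cannot sleep here (softirq context / spinlock held), so
     * GFP_ATOMIC is the only option, and exactly this call starts
     * failing when free memory is close to 0. */
    ctx = kmalloc(sizeof(*ctx), GFP_ATOMIC);
    if (!ctx)
        return NULL;

    ctx->payload = kmalloc(payload_len, GFP_ATOMIC);
    if (!ctx->payload) {
        kfree(ctx);
        return NULL;    /* caller has no choice but to drop the packet */
    }
    ctx->len = payload_len;
    return ctx;
}
```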

Obviously a cron job running sync && echo 3 > /proc/sys/vm/drop_caches is not a solution, because of the disk performance impact. It is also possible to pick some set of files and disable caching on them, but that does not look like a good or reliable solution either.
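For completeness, the "disable caching on a set of files" idea would look roughly like the user-space sketch below (the file path is just an example). `O_DIRECT` bypasses the page cache entirely, and `posix_fadvise(POSIX_FADV_DONTNEED)` drops a file's cached pages after use; neither does anything about atomic allocations failing, which is part of why this does not look reliable.

```c
#define _GNU_SOURCE             /* for O_DIRECT */
#include <fcntl.h>
#include <unistd.h>

int main(void)
{
    /* Option 1: bypass the page cache for this file completely.
     * O_DIRECT requires suitably aligned buffers and sizes (not shown). */
    int fd_direct = open("/var/data/bulk.bin", O_RDONLY | O_DIRECT);
    if (fd_direct >= 0)
        close(fd_direct);

    /* Option 2: use the cache while reading, then ask the kernel to
     * drop this file's cached pages once we are done with it. */
    int fd = open("/var/data/bulk.bin", O_RDONLY);
    if (fd >= 0) {
        /* ... read the file ... */
        posix_fadvise(fd, 0, 0, POSIX_FADV_DONTNEED);
        close(fd);
    }
    return 0;
}
```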

The questions are:

  • What is a proper and reliable solution in such a case (from the kernel-mode side, the user-mode side, or both)?
  • Why is it considered that there can be no reason to disable (or reduce the intensity of) the disk cache?

Some posts that didn't help me: 1 and 2.

  • You are overly focused on disk cache. Forget about it. Instead go look at how the rest of the Linux kernel network code allocates memory. – Zan Lynx Dec 26 '18 at 20:44
  • https://stackoverflow.com/questions/21374491/vm-min-free-kbytes-why-keep-minimum-reserved-memory – Zan Lynx Dec 26 '18 at 20:46
  • And a possibly relevant quote, "The only time I've had to increase min_free_kbytes is after enabling jumbo frames on a network card that didn't support dma scattering." – Zan Lynx Dec 26 '18 at 20:48
  • You may need to create your own slab cache. It's an option. – Zan Lynx Dec 26 '18 at 20:50
  • And sometimes you just have to tell the network sender "This host / socket / whatever is full. Go away." – Zan Lynx Dec 26 '18 at 20:51
  • @ZanLynx Thanks for your attention! About the first and the last comments: this is some production code which **has to do it in atomic context** and should pass some tests. It can't be completely rewritten now :( – red0ct Dec 26 '18 at 20:59
  • 1) What is the problem you are trying to solve? 2) What are you afraid of? – wildplasser Dec 26 '18 at 20:59
  • @wildplasser 1) **Combining `kmalloc()` in atomic context with a highly loaded system while free memory is close to 0.** 2) I'm not afraid of any solution, because right now I have no appropriate solution at all. – red0ct Dec 26 '18 at 21:04
  • Increase min_free_kbytes as much as you need to. If that does not solve your problem then your problem is unsolvable because it would indicate that your network is accepting traffic faster than you process it. If that is the case nothing can help you and all is doomed to failure. – Zan Lynx Dec 26 '18 at 21:20
  • Oh and if your real problem is that you need multiple-page blocks of memory, like Order 3 or 4, and as atomic allocations, then you're toast. That won't happen. You'll need to preallocate chunks that big in your own slab cache and make sure you refill the cache from non-atomic context when it gets low. The default kmalloc allocators will run out of multiple-page chunks very quickly. – Zan Lynx Dec 26 '18 at 21:24
  • @ZanLynx I'll try the workaround with `min_free_kbytes`. BTW you can post it as an answer :) It is processing traffic faster than accepting it; the problem is just with the allocation, which fails under these particular conditions. – red0ct Dec 26 '18 at 21:29
  • I can't back this with a reference right now, but my understanding is that pages which contain (non-dirty) cache data can indeed be allocated atomically, because the contents don't have to be written back to disk, hence there is no need to sleep. The contents can simply be discarded and the page can be allocated. – Ctx Dec 26 '18 at 21:29
  • Good information here, only a little out of date: https://lwn.net/Kernel/LDD3/ especially Chapter 8. – Zan Lynx Dec 26 '18 at 21:32
  • @Ctx I would appreciate the details about non-dirty cached pages re-allocation. – red0ct Dec 26 '18 at 21:32
  • @jww The question is not only about Linux tuning. It is about the linux-kernel development too. – red0ct Dec 26 '18 at 21:34
  • @red0ct Sorry, I can't find a reference which explicitly states this right now. But all descriptions for `kmalloc()` state that it allocates pages without sleeping; dropping a clean page containing cached data does not require kmalloc to sleep. – Ctx Dec 26 '18 at 21:50
  • But testing this theory should be easy... Fill the caches and load a module allocating single pages with GFP_ATOMIC until it fails. Then have a look at the free memory stats. – Ctx Dec 26 '18 at 21:53
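For reference on the two workarounds suggested in the comments above: raising the watermark is a one-liner (`sysctl -w vm.min_free_kbytes=<value>`), while the preallocation idea is less obvious. A rough sketch of it, using a slab cache backed by a mempool (the names and sizes are hypothetical, not from the actual code), could look like this:

```c
#include <linux/init.h>
#include <linux/errno.h>
#include <linux/slab.h>
#include <linux/mempool.h>

#define PKT_BUF_SIZE   2048     /* hypothetical per-packet buffer size */
#define PKT_POOL_MIN   256      /* elements always kept in reserve */

static struct kmem_cache *pkt_cache;
static mempool_t *pkt_pool;

/* Process context (e.g. module init): allocations here may sleep, so
 * the reserve is filled while memory is still easy to get. */
static int __init pkt_pool_init(void)
{
    pkt_cache = kmem_cache_create("pkt_buf", PKT_BUF_SIZE, 0, 0, NULL);
    if (!pkt_cache)
        return -ENOMEM;

    pkt_pool = mempool_create_slab_pool(PKT_POOL_MIN, pkt_cache);
    if (!pkt_pool) {
        kmem_cache_destroy(pkt_cache);
        return -ENOMEM;
    }
    return 0;
}

/* Hot path, atomic context: try a normal allocation first and fall
 * back to the preallocated reserve when memory is tight. */
static void *pkt_buf_get(void)
{
    return mempool_alloc(pkt_pool, GFP_ATOMIC);
}

static void pkt_buf_put(void *buf)
{
    mempool_free(buf, pkt_pool);    /* freed elements refill the reserve */
}
```

`mempool_alloc()` only dips into the reserve when the ordinary allocation fails, and every `mempool_free()` tops the reserve back up, so the reserve only has to cover bursts of memory pressure rather than steady-state traffic.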

0 Answers