Am I correct to assume that when a process calls malloc, there may be I/O involved (swapping pages out, etc.) to make memory available, which in turn implies it can block for a considerable time? If so, shouldn't we have two versions of malloc in Linux: one, say "fast_malloc", suitable for obtaining smaller chunks and guaranteed not to block (though it may of course still fail with OUT_OF_MEMORY), and another, "async_malloc", where we could ask for arbitrarily sized space but must supply a callback?
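To make the idea concrete, here is a rough sketch of the interface I'm imagining (fast_malloc and async_malloc are hypothetical names; nothing like this exists in glibc as far as I know):

```c
#include <stddef.h>

/* Hypothetical: returns immediately; NULL if the request cannot be
 * satisfied without blocking (i.e. treat NULL as OUT_OF_MEMORY). */
void *fast_malloc(size_t size);

/* Hypothetical: may take arbitrarily long (page reclaim, swapping);
 * fires cb(ptr, ctx) when done, with ptr == NULL on failure. */
typedef void (*malloc_cb)(void *ptr, void *ctx);
void async_malloc(size_t size, malloc_cb cb, void *ctx);
```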
Example: if I need a small chunk of memory for a new linked-list node, I may prefer the traditional inline malloc, knowing the OS should be able to satisfy it 99.999% of the time or just fail. Another example: if I'm a DB server trying to allocate a sizable chunk to hold indexes, I may opt for async_malloc and deal with the "callback complexity".
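In terms of the hypothetical interface above, those two cases would look roughly like this:

```c
struct node { struct node *next; int value; };

/* Completion handler for the large, possibly slow allocation. */
static void on_index_mem(void *ptr, void *ctx)
{
    (void)ctx;
    if (ptr == NULL)
        return;                 /* allocation failed */
    /* ... build the indexes inside ptr ... */
}

void examples(void)
{
    /* Small allocation: fail fast rather than risk blocking. */
    struct node *n = fast_malloc(sizeof *n);
    if (n == NULL) {
        /* treat as OUT_OF_MEMORY */
    }

    /* Large allocation: tolerate the latency, just not on this thread. */
    async_malloc(256UL * 1024 * 1024, on_index_mem, NULL);
}
```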
The reason I bring this up is that I'm looking to build highly concurrent servers handling hundreds of thousands of web requests per second, and I generally want to avoid dedicating a thread to each request. Put another way, any time I/O occurs I want it to be asynchronous (say, libevent-based). Unfortunately, I'm realizing most C APIs lack proper support for asynchronous use. For example, the ubiquitous MySQL C library is entirely blocking, and that's just one of the libraries my servers use extensively. Again, I can always simulate non-blocking behavior by offloading to another thread, but that's nowhere near as cheap as waiting for the result via a completion callback.
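For reference, this is the thread-offload workaround I mean, as a minimal sketch (async_query, query_job, and query_cb are names I made up; in a real libevent server the callback would also have to be marshalled back onto the event-loop thread, e.g. through a pipe):

```c
#include <pthread.h>
#include <stdlib.h>
#include <mysql/mysql.h>

/* Completion callback: receives the result set, or NULL on error. */
typedef void (*query_cb)(MYSQL_RES *res, void *ctx);

struct query_job {
    MYSQL      *conn;    /* connection used exclusively by this query */
    const char *sql;
    query_cb    cb;
    void       *ctx;
};

static void *query_worker(void *arg)
{
    struct query_job *job = arg;
    MYSQL_RES *res = NULL;

    mysql_thread_init();                      /* the client library wants this per thread */
    if (mysql_query(job->conn, job->sql) == 0)
        res = mysql_store_result(job->conn);  /* blocking, but off the event loop */

    job->cb(res, job->ctx);   /* NOTE: runs on the worker thread */

    mysql_thread_end();
    free(job);
    return NULL;
}

/* Offload a blocking mysql_query() to a detached worker thread. */
static int async_query(MYSQL *conn, const char *sql, query_cb cb, void *ctx)
{
    struct query_job *job = malloc(sizeof *job);
    if (job == NULL)
        return -1;
    job->conn = conn;
    job->sql  = sql;
    job->cb   = cb;
    job->ctx  = ctx;

    pthread_t tid;
    if (pthread_create(&tid, NULL, query_worker, job) != 0) {
        free(job);
        return -1;
    }
    pthread_detach(tid);
    return 0;
}
```

Spawning (or even pooling) a thread per query like this is exactly the overhead I'd like to avoid, which is why a genuinely callback-based API would be so much nicer.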