1

This question was inspired by

Prevent malloc function wrapping

I have written a program that allocates memory for individual nodes in a linked list by calling malloc.

There are speed tests in which malloc is wrapped by a function that causes malloc calls to take more time than normal. This allows the tests to detect frequent usage of malloc.

How do I avoid calling malloc for each node?

Jean-François Fabre
  • 137,073
  • 23
  • 153
  • 219
  • Comments are not for extended discussion; this conversation has been [moved to chat](https://chat.stackoverflow.com/rooms/194138/discussion-on-question-by-david-cullen-how-to-avoid-calling-malloc-for-each-node). – Samuel Liew May 29 '19 at 23:21

5 Answers5

6

There are speed tests in which malloc is called by wrapped function which is written to take more time and allocate memory. So every time when I call malloc in my pogram it is called but it takes more time so the tests can detect very often usage of malloc. The problem is that I use linked lists so the memory is allocated separately for every node of the list. I don't know how to change this implementation because it is really comfortable to use linked lists in my structure.

You might be able to use an array instead.

For a simple example:

#include <stdio.h>
#include <stdlib.h>

struct list_entry {
    struct list_entry *next;
    int foo;
};

#define MAX_THINGS 1234567
struct list_entry myArray[MAX_THINGS];
int firstFreeEntry = 0;

struct list_entry *freeListHead = NULL;

struct list_entry *listHead = NULL;

struct list_entry *allocEntry(void) {
    struct list_entry *temp;

    if(freeListHead != NULL) {
        // Recycle a previously freed entry
        temp = freeListHead;
        freeListHead = temp->next;
        return temp;
    }
    // Try to take a new entry from the array
    if(firstFreeEntry < MAX_THINGS) {
        return &myArray[firstFreeEntry++];
    }
    // Give up (no free entries)
    return NULL;
}

void freeEntry(struct list_entry *entry) {
    int offset;

    // Try to give it back to the array
    offset = entry - myArray;
    if(offset == firstFreeEntry - 1) {
        firstFreeEntry--;
        return;
    }
    // Put it on the list of freed things
    entry->next = freeListHead;
    freeListHead = entry;
}

// Allocate an entry, initialize/construct it, and put it on the linked list

struct list_entry *createEntry(int value) {
    struct list_entry *newEntry;
    newEntry = allocEntry();
    if(newEntry != NULL) {
        newEntry->foo = value;
        newEntry->next = listHead;
        listHead = newEntry;
    }
    return newEntry;
}

int main() {
    const int node_count = 1000 * 1000;
    struct list_entry *head = NULL;
    for (int index = 0; index < node_count; index++) {
        head = createEntry(0xdeadbeef);
        printf("address of head = %p\n", head);
    }
    while (head) {
        struct list_entry *next = head->next;
        printf("address of head = %p\n", head);
        freeEntry(head);
        head = next;
    }
    return 0;
}

Output

address of head = 0x101d32040
address of head = 0x101d32050
address of head = 0x101d32060
...
address of head = 0x101d32040

Verification

$ ./test > storage.txt
$ split -l 1000000 storage.txt
$ tac xab > xac
$ diff xaa xac
Brendan
  • 35,656
  • 2
  • 39
  • 66
  • That's called a "cursor" implementation. – jww May 29 '19 at 21:48
  • Brendan, I'll ping you on this as I happened to mention you based on some research: https://stackoverflow.com/questions/56384291/what-happens-to-a-startup-ipi-sent-to-an-active-ap-that-is-not-in-a-wait-for-sip – Michael Petch May 31 '19 at 15:18
3

A simple solution is to implement alternative dynamic memory functions using mmap().

void* alt_malloc( size_t size )
{
    void* mem = mmap( NULL, size + sizeof(size_t),
                      PROT_READ | PROT_WRITE, 
                      MAP_PRIVATE | MAP_ANONYMOUS, -1, 0 ) ;

    if( mem != MAP_FAILED )
    {
        *(size_t*)mem = size ;
    }
    else
    {
        mem = NULL ;
    }

    return mem + sizeof( size_t) ;
}

void* alt_calloc( size_t nitems, size_t size)
{
    return alt_malloc( nitems * size ) ;
}

void alt_free( void* mem )
{
    if( mem != NULL) munmap( mem, *((size_t*)mem - 1) ) ;
} 

void* alt_realloc( void *old_mem, size_t new_size )
{
    void* new_mem = alt_malloc( new_size ) ;
    if( new_mem != NULL )
    {
        size_t old_size = *((size_t*)old_mem - 1) ;
        size_t copy_size = old_size > new_size ? new_size : old_size ;
        memcpy( new_mem, old_mem, copy_size ) ;
        alt_free( old_mem ) ;
    }   

    return new_mem ;
}

The following test:

#define ALLOC_SIZE 1024 * 1024
int main()
{
    char* test = alt_malloc( ALLOC_SIZE ) ;
    memset( test, 'X', ALLOC_SIZE ) ;
    printf( "%p : %c %c\n", test, test[0], test[ALLOC_SIZE-1] ) ;
    test = alt_realloc( test, ALLOC_SIZE * 2 ) ;
    printf( "%p : %c %c\n", test, test[0], test[ALLOC_SIZE-1] ) ;
    alt_free( test ) ;

    return 0;
}

Outputs:

0x7f102957d008 : X X
0x7f1028ea3008 : X X

Demonstrating that the memset() covered the extent of the initial block, and that the realloc created a new block and copied the data.

A more efficient, though slightly more complex solution would be to use mmap() to allocate an alternative heap, and then implement a heap manager operating on that block as an alternative to the standard functions. There is no shortage of heap management examples.

You could for example take the Newlib C library allocator with modified names and implement the sbrk() syscall (again renamed to prevent clash) using mmap() to provide memory to the alternate heap.

Clifford
  • 88,407
  • 13
  • 85
  • 165
  • 1
    I don't see how it is connected to linked list memory to be honest – Jean-François Fabre May 29 '19 at 19:37
  • 2
    Note that `mmap()` returns memory in page-sized chunks and only in page-sized chunks, usually 4KB on x86 systems. And a single `mmap()` call is likely going to be a lot slower than a single `malloc()` call, especially for small sizes. – Andrew Henle May 29 '19 at 19:43
  • 1
    @Jean-FrançoisFabre : You are right, the answer appears to have been moved from the original question which stated _"I have read something about mmap function but could't find any good example of it's usage instead of malloc."_. This new question only describes the original problem, not the proposed solution he was having troubles with. – Clifford May 29 '19 at 20:08
  • 1
    @AndrewHenle Yes, it is wasteful, but I assumed only required for the test environment. Although it is a bizarre test IMO, a wrapper could simply log timestamp and block size of each allocation to analyse dynamic memory usage. – Clifford May 29 '19 at 20:11
  • 1
    @Clifford my fault: I merged and it has consequences like this (most moderators wouldn't want to try merge for this reason). Wanna try to adapt the answer? – Jean-François Fabre May 29 '19 at 20:34
  • @Clifford how can I get an implementation of the realloc function in this style? – avan1235 May 29 '19 at 21:02
  • 1
    @avan1235 : The naive way is to allocate a new block copy the data from the old block, delete the d block and return the new block. That could be optimised through knowledge of the page size and the actual amount of mapped memory to avoid reallocation in many cases. – Clifford May 29 '19 at 21:08
  • @Clifford I don't know these functions, can you give an example of some implementation? How to do it in naive way? – avan1235 May 29 '19 at 21:11
  • 1
    @avan1235 it is not really a matter of knowing these functions, rather simply one of understanding a functions behaviour and implementing it. – Clifford May 29 '19 at 21:47
  • @Clifford so what about: void* reallocate_memory(void *old, size_t size, size_t oldSize) { void *newMem = allocate_memory(size); if (newMem == NULL) return NULL; newMem = memcpy(newMem, old, oldSize); if (newMem != NULL) free_memory(old); return newMem; } – avan1235 May 29 '19 at 21:48
  • 1
    @avan1235 Your suggestion has a different signature than realloc. The allocation size is prepended to the memory returned to the caller, so there is no need to provide the old size. – Clifford May 29 '19 at 22:00
  • @Clifford partially working, don;t know why get SIGSEGV error in some point when replaced malloc, free and realloc functions wit your implementations – avan1235 May 29 '19 at 22:12
  • 1
    @avan1235 SIGSEGV is generated on an attempt to write into a region mapped as read-only. – Clifford May 29 '19 at 22:23
  • When I use the commented version of functions everything works well, but after switch to your modified version something wents wrong `void* allocate_memory(size_t size) { void* mem = mmap( NULL, size + sizeof(size_t), PROT_READ | PROT_WRITE, MAP_PRIVATE | MAP_ANONYMOUS, -1, 0 ); return mem == MAP_FAILED ? NULL : mem + sizeof(size_t); // return malloc(size); }` – avan1235 May 29 '19 at 22:30
  • 1
    @avan1235 : don't post code in comments - it is entirely unreadable. realloc is not correct - working on it. – Clifford May 29 '19 at 22:34
  • @avan1235 Are calls to `mmap` wrapped? –  May 29 '19 at 22:40
  • No, only malloc, calloc realloc, alloc_array – avan1235 May 29 '19 at 22:42
  • 1
    @avan1235 fixed. Forgot to write the allocation size in the mmap'ed block in `alt_alloc()` causing `alt_free()` to do nothing and `alt_realloc()` to fail. See test code run at https://onlinegdb.com/B1Nm2K3pE – Clifford May 29 '19 at 22:45
  • @Clifford still gets some problems when freeing the NULL pointers – avan1235 May 30 '19 at 05:41
  • When check `if (mem != NULL) munmap(mem, *((size_t*) mem - 1));` everything works well :) – avan1235 May 30 '19 at 05:51
  • @avan1235 Good point if you need `alt_free()` to follow the exact semantics of `free()`, but if you are habitually trying to free null pointers, I'd question the correctness of your code in any case. – Clifford May 30 '19 at 08:53
2

This program allocates memory for nodes in a linked list in contiguous chunks. When the memory in a chunk is exhausted, a new chunk is allocated.

#include <stdio.h>
#include <stdlib.h>

// An imaginary node because the original question did not provide one
struct node {
    struct node *next, *prev;
    int numbers[1024];
    char string[1024];
};

struct node_storage {
    int count;
    int total;
    struct node *node_list;
    struct node_storage *next;
};

struct node_storage *add_storage(int count) {
    struct node_storage *pstorage = malloc(sizeof(struct node_storage));
    // We could avoid a malloc here by making node_list an array
    pstorage->node_list = malloc(sizeof(struct node) * count);
    pstorage->count = count;
    pstorage->total = count;
    pstorage->next = NULL;
    return pstorage;
}

void free_storage(struct node_storage *storage) {
    while (storage) {
        struct node_storage *next = storage->next;
        free(storage->node_list);
        free(storage);
        storage = next;
    }
}

struct node *new_node(struct node **free_list, struct node_storage **storage) {
    struct node *free_node = *free_list;
    struct node_storage *pstorage = *storage;
    struct node *result;
    // If there is a free node
    if (free_node) {
        // Get the new node from the free list
        result = free_node;
        *free_list = free_node->next;
    }
    else {
        // Get the new node from the pre-allocated storage
        result = &pstorage->node_list[pstorage->total - pstorage->count];
        pstorage->count--;
        // If that was the last pre-allocated node
        if (0 == pstorage->count) {
            // Allocate the next chunk of nodes
            pstorage->next = add_storage(pstorage->total);
            *storage = pstorage->next;
        }
    }
    return result;
}

void free_node(struct node **free_list, struct node *node) {
    struct node *pfree_list = *free_list;
    if (pfree_list) {
        node->next = pfree_list;
    }
    *free_list = node;
}

int main() {
    const int node_count = 1000 * 1000;
    struct node_storage *head;
    struct node_storage *curr;
    struct node *free_list = NULL;
    struct node *node_list = NULL;
    head = add_storage(100);
    curr = head;
    // Allocate a lot of nodes and put them in a list
    for (int index = 0; index < node_count; index++) {
        struct node *new = new_node(&free_list, &curr);
        printf("address of new = %p\n", new);
        new->next = node_list;
        node_list = new;
    }
    // Free all of the allocated nodes
    while (node_list) {
        struct node *pnode = node_list;
        node_list = node_list->next;
        free_node(&free_list, pnode);
    }
    // Allocate a lot of nodes so that we can verify that they come from
    // the free list
    for (int index = 0; index < node_count; index ++) {
        struct node *new = new_node(&free_list, &curr);
        printf("address of new = %p\n", new);
    }
    free_storage(head);
    return 0;
}

Output

address of new = 0x10f972000
address of new = 0x10f973410
...
address of new = 0x243570230

Warning

This code does no error checking and should not be used in a production environment.

Note

I modified the code so that a freed node is placed on a free list. That list is checked first when requesting a new node. I tested this by comparing the addresses of the nodes like this:

$ ./test > storage.txt
$ split -l 1000000 storage.txt
$ diff xaa xab
2

This is similar to the answer from @Brendan but has no fixed limit to the number of nodes. It is derived from code I already wrote.

When a node is released it is placed in a linked list pool. When a node is needed it is taken from the pool (if any) or from an array (if any) or the array is extended, not by just one node, but by a large number of them. This reduces the number of calls to realloc.

#include <stdio.h>
#include <stdlib.h>

#define CHUNK 8192

typedef struct Node {
    int data;
    struct Node *next;
} node;

node *array;
node *pool;
int nodes_avail;
int nodes_used;

node *getnode(void)
{
    node *newnode;
    if(pool) {                              // any in the recycled pool?
        newnode = pool;
        pool = pool->next;
    }
    else if(nodes_used < nodes_avail) {     // any in the array?
        newnode = &array[nodes_used];
        nodes_used++;
    }
    else {                                  // extend the array?
        nodes_avail += CHUNK;
        node *temp = realloc(array, nodes_avail * sizeof *temp);
        if(temp == NULL) {
            exit(1);                        // or recovery
        }
        array = temp;
        newnode = &array[nodes_used];
        nodes_used++;
    }
    return newnode;
}

void freenode(node *nptr)
{
    nptr->next = pool;                      // add to recycled pool
    pool = nptr;
}

int main() {
    const int node_count = 1000 * 1000;
    node *head = NULL;
    for (int index = 0; index < node_count; index++) {
        node *new = getnode();
        new->next = head;
        head = new;
        printf("address of head = %p\n", head);
    }
    while (head) {
        node *next = head->next;
        freenode(head);
        head = next;
    }
    for (int index = 0; index < node_count; index++) {
        node *new = getnode();
        new->next = head;
        head = new;
        printf("address of head = %p\n", head);
    }
    return 0;
}

Output

address of head = 0x100bc7000
address of head = 0x100bc7010
address of head = 0x100bc7020
...
address of head = 0x101b2a3f0

Verification

$ ./test > storage.txt
$ split -l 1000000 storage.txt
$ diff xaa xab
Weather Vane
  • 33,872
  • 7
  • 36
  • 56
  • @DavidCullen why the edit? Was there something wrong? Global variables do not need to be explicitly initialised to `0`. – Weather Vane May 29 '19 at 23:18
  • I'm just trying to convert each answer into an MCVE. If you dislike my changes, feel free to undo them. –  May 29 '19 at 23:22
  • @DavidCullen I usually like to post a fully working example but could not see how to easily make a dynamically random get/free node test. My answer was a similar idea to that of [Kaz](https://stackoverflow.com/a/56367126/4142924) too. I wondered from the last line of your verification if there is something wrong with my code. Note that your `new->next = head;` should be irrespective of the value of `head`: the first node should have its `next = head;` which is initially `NULL`. – Weather Vane May 29 '19 at 23:27
1

The simplest thing you can do is continue to call malloc for the individual nodes of the linked list, but when nodes are freed, put them on a list of free nodes.

node *free_list = 0;

node *node_alloc(void)
{
   /* get a node fast from the free list, if available */
   if (free_list != 0) {
      node *ret = free_list;
      free_list = free_list->next;
      return ret;
   } else {
      node *ret = malloc(sizeof *ret);
      /* check for non-null, initialize */
      return ret;
   }
}

void node_free(node *node)
{
   node->next = free_list;
   free_list = node;
}

If your program has some objects that are only on one list at any given time, you can put the list nodes right inside those objects, so that they don't require separate allocation:

struct process {
   node queue_node; /* list links for putting process into queues */
   ...
};
Kaz
  • 55,781
  • 9
  • 100
  • 149
  • As a not-too-big extension of this, when new memory needs to be allocated, several nodes could be allocated at once and put on the free node list, instead of allocating nodes one at a time. – John Bollinger May 29 '19 at 20:22