0

First, this question is not for those who see themselves as officers in the C++ police as it involve some serious C bending to squeeze back a bit of memory so please read this question with your vigilante hat on.

I have a program with many many strings allocated with malloc where most of them are 1 char long. A string that contains 1 char takes about ~32 bytes of real memory including the descriptors, block size etc. So my cunning plan is to use the char * pointer to store a char or a string by this:

char *sourceStringOrChar;

sourceStringOrChar = malloc(10);
//-- OR
sourceStringOrChar = (char *)'A';    //-- Note that its '' and not ""

if((intptr_t)sourceStringOrChar & (~0xFF))
    TRACE("%s\n", sourceStringOrChar);
else
    TRACE("%c\n", sourceStringOrChar);

I already know that malloc returns ENOMEM when its out of memory where the real value of this constant is 12. That gives me some hope that there is a vacancy in malloc returned results. I've started to read the source code of malloc to determine if this is possible, but if someone here have some knowledge on the deep-end of malloc it might save me a lot of time.

Edit:

  • The question of course is whether this is possible.
  • Some of you are worried on free/strlen etc. but please note that this is an example code and it can be handled the same as the above with the

    if((intptr_t)sourceStringOrChar & (~0xFF))
    {
        strlen(sourceStringOrChar);
        free(sourceStringOrChar);
    }
    
  • Also, I wouldn't go in that dangerous path if there wasn't a big memory issue

Moav
  • 21
  • 4
  • What's the point of doing this? Does your program have some unusual memory constraints? Also, if sometimes your pointers point to a NUL-terminated string and sometimes they point to a single `char` without NUL terminator, how will your program know which is which? – Bob Jarvis - Слава Україні Mar 05 '17 at 13:49
  • 1
    you have a memory leak up there, watch out boy!! – Seek Addo Mar 05 '17 at 13:50
  • 3
    Or... you could allocate a large chunk of memory for individual characters. Have the pointer point into that buffer, and simply check if the address is in the range of this buffer. Thus eliminating some of the hackery. – StoryTeller - Unslander Monica Mar 05 '17 at 13:52
  • your solution seems endian dependent. – Jean-François Fabre Mar 05 '17 at 13:52
  • `sourceStringOrChar = (char *)'A'` means the pointer value is 65, it's really that you want? You should not derefence it. How do you use a 'bad' pointer and distinguish it from a valid pointer on a string? – Aubin Mar 05 '17 at 13:54
  • 1
    For any string shorter than `sizeof(void*)`, dealing with pointers gives you an overhead of more than 100%. Would it be feasible to implement some sort of bucket system? E.g. for any string with length > 8 bytes, use your current approach, but for anything shorter, have a separate bucket for each length. Every bucket for length N would consist of a a number indicating the amount of strings it holds, and a large contiguous memory region where all strings are stored on an N-byte boundary. – Siguza Mar 05 '17 at 13:55
  • 7
    malloc returns `NULL`, not `ENOMEM` when out of memory. `ENOMEM` might be stored in `errno`. – Jens Mar 05 '17 at 13:58
  • How is code going to free the pointers? With some sort of prior test like `(intptr_t)sourceStringOrChar & (~0xFF)`? – chux - Reinstate Monica Mar 05 '17 at 14:20
  • @StoryTeller, this is an interesting take on this. To have a global chunk 512 bytes long that contains one char followed by NULL for each of the 256 chars possible and make *sourceStringOrChar* point to the specific char there. It might add some cumbersome but I'll definitely think about it. – Moav Mar 05 '17 at 14:24
  • @Moav - Not 512. But 256. If the pointer points into the buffer, then you print with `TRACE("%c\n", *sourceStringOrChar);` because you know it's an individual character. – StoryTeller - Unslander Monica Mar 05 '17 at 14:26
  • What specifically is the question? – Paul Hankin Mar 05 '17 at 14:37
  • 3
    `union { char * ptr; char buf[4]; }` is probably a better trick assuming 32-bit pointers. If little-endian, you can represent a 1 or 2 character string as [255, x, y, 0], and `#define GET_STR(x) ((x).buf[0] == 255 ? (x).buf+1 : (x).ptr)` then `printf("%s", GET_STR(s))` avoids littering your code with explicit conditionals. Although it uses an assumption that `malloc` never returns an odd address. – Paul Hankin Mar 05 '17 at 14:46
  • @Paul - C++ police here - You are more or less reinventing the small string optimization of `std::string`, which will store all short strings without any `malloc` at all. And use less than 32 bytes. – Bo Persson Mar 05 '17 at 15:01
  • @Paul Hankin, Please note that you're adding 4 extra bytes for each string. Multiply it by the enormous number of strings and the program runs out of memory. – Moav Mar 05 '17 at 15:34
  • 1
    @Moav You're wrong. Paul showed an `union`, not a `struct`. That union does not need a single bit more than the char pointer. – Daniel Jour Mar 05 '17 at 15:55
  • @DanielJour and @Paul, You're right. My bad. Some abbreviation of the `union` approach might be more compliance with the C standard. But still, the question about `malloc` boundaries remains. – Moav Mar 05 '17 at 16:46

2 Answers2

0

I've been working on the code of CLISP a bit. Something similar is done there to squeeze some immediate values into a pointer (instead of allocating an object from the garbage collected heap).

In the case of CLISP this works because we know the range of addresses that could be returned by our allocator, though. Then there's a bit that could never be set by an valid address, which if set indicates that the "pointer" is not an actual pointer but data (the single - or more - character in your case).

Btw: Using a char * is not a good plan, you're halfway to shooting yourself in the foot due to undefined behavior or just accidentally passing such a "pointer" to for example strlen. I guess using a union is a much better approach (Though I don't currently have time to check the actual rules from the standard about whether using an union in this way is even allowed). Even more safe is packing an uintptr_t inside a struct.

[..] but if someone here have some knowledge on the deep-end of malloc it might save me a lot of time.

That's not going to buy you anything. Switch operating system, platform or just the used standard library and everything could be different from what you learnt about the malloc implementation that you're currently looking at.

One approach is to use your own allocator - like done with CLISP - which is taking memory from some pool (for example acquired with mmap on Linux/BSD) which resides at some predefined address.

But you can also use malloc (or a better suitable function, like C11 aligned_alloc or posix_memalign) to allocate aligned memory for your strings. Let's say you align every string at an even address. That way when you see an address which has the least significant bit set, then you can be sure that it's not actually an address but rather immediate data, i.e. the string itself.

Using malloc only to allocate 2 byte aligned memory works so: You allocate 2 additional bytes. If the address returned by malloc is already correctly aligned, then you return the next correctly aligned address (char_ptr + 2) and mark the cell directly before that address with some value which indicates that the original address was aligned (char_ptr[1] = '1'). If, on the other hand, the returned address is not correctly aligned, then you return the directly following byte (which is correctly aligned; char_ptr + 1) and mark the cell directly before that address (thus, the first; char_ptr[0] = '0').

When freeing, look at the cell directly before the address that was passed in, it contains the mark to tell you which address you need to free.

In code:

#define IS_ALIGNED_2BYTE(value) (((uintptr_t)(value) & 0x01) == 0)

/// Allocate a memory region of the specified size at an even (2 byte
/// aligned) address.
///
/// \param[in] size Required size of the memory region.
/// \return Pointer to the memory, or NULL on failure.
inline static void * allocate_2byte_aligned(size_t size) {
#ifdef HAVE_ALIGNED_ALLOC
  return aligned_alloc(2, size);
#elif defined(HAVE_POSIX_MEMALIGN)
  void * ptr;
  if (posix_memalign(&ptr, sizeof(void *), size) == 0) {
    assert(IS_ALIGNED_2BYTE(ptr)); // Paranoia due to uncertainty
                                   // about alignment parameter to
                                   // posix_memalign.
    return ptr;
  } else {
    return NULL;
  }
#else
  char * const memory = malloc(size + 2);
  if (! memory) {
    return NULL;
  }
  if (IS_ALIGNED_2BYTE(memory)) {
    // memory is correctly aligned, but to distinguish from originally
    // not aligned addresses when freeing we need to have at least one
    // byte. Thus we return the next correctly aligned address and
    // leave a note in the byte directly preceeding that address.
    memory[1] = '1';
    return &(memory[2]);
  } else {
    // memory is not correctly aligned. Leave a note in the first byte
    // about this for freeing later and return the next (and correctly
    // aligned) address.
    memory[0] = '0';
    return &(memory[1]);
  }
#endif
}


/// Free memory previously allocated with allocate_2byte_aligned.
///
/// \param[in] ptr Pointer to the 2 byte aligned memory region.
inline static void free_2byte_aligned(void * ptr) {
  assert(IS_ALIGNED_2BYTE(ptr));
#if defined(HAVE_ALIGNED_ALLOC) || defined(HAVE_POSIX_MEMALIGN)
  free(ptr);
#else
  char const * const memory = ptr;
  void const * original_address;
  if (memory[-1] == '0') {
    // malloc returned an address that was not aligned when allocating
    // this memory block. Thus we left one byte unused and returned
    // the address of memory[1]. Now we need to undo this addition.
    original_address = &(memory[-1]);
  } else {
    // malloc returned an address that was aligned. We left two bytes
    // unused and need to undo that now.
    assert(memory[-1] == '1');
    original_address = &(memory[-2]);
  }
  free((void *) original_address);
#endif
}

Creating and destroying "pointer-or-immediate-data" structures is then straightforward:

typedef struct its_structure {
  uintptr_t data; ///< Either a pointer to the C string, or the actual
                  ///< string, together with a bit to indicate which
                  ///< of those it is.
} its;

its its_alloc(size_t size) {
  if (size < sizeof(uintptr_t)) {
    its const immediate_string = {.data = 0x01};
    return immediate_string;
  } else {
    void * const memory = allocate_2byte_aligned(size);
    assert(IS_ALIGNED_2BYTE(memory));
    its const allocated_string = {
      .data = memory ? (uintptr_t) memory : (0x01 | 0x02) /* Invalid string */};
    return allocated_string;
  }
}

void its_free(its string) {
  if (IS_ALIGNED_2BYTE(string.data)) {
    free_2byte_aligned((void *) string.data);
  } // else immediate, thus no action neccessary
}

Above code is actually from a small library I wrote to test/write this answer. If you want then use / enhance it.

Community
  • 1
  • 1
Daniel Jour
  • 15,896
  • 2
  • 36
  • 63
  • I found a solution that works with standard malloc, I'll try to add it when I have time (in a couple of hours). – Daniel Jour Mar 05 '17 at 15:40
  • Still waiting for your solution – Moav Mar 06 '17 at 12:26
  • @Moav I didn't forget you, sorry for the delay though (I was ill and then had to work up other stuff). – Daniel Jour Mar 16 '17 at 21:48
  • Your code answers the question perfectly. You can store 31 or 63 bit of data on a pointer variable and distinctively know whether its a pointer or not. But after sleeping on your code I've decided to go the long way with a more traditional solution of sharing similar strings. I'll definitely tuck this trick up my sleeve for future use. – Moav Mar 28 '17 at 12:59
0

If you really have "many many strings" which are one character long, then at least one of the following must be true:

  1. Your characters are more than eight bits long

  2. You could benefit greatly by "interning" your strings so that you don't create more than one copy of the same string.

If your strings are immutable (or you can arrange for them to not be mutated in place), then #2 is a very attractive option. In the case that #1 is false and you only have 255 possible one-character strings, then you can intern simply by indexing into a prebuilt table of 510 bytes (alternating single characters and NULs). The more general interning strategy requires a hash table or some such, and is there more work (but potentially extremely valuable).

If your "characters" are actually short byte sequences but do not repeat often, then you might want to implement a simple pool allocation scheme. This will be easiest/most effective if there isn't much "churn" in your strings; that is, you don't frequently allocate and immediately free the string.

A simple string pool allocation scheme is to choose some cutoff and allocate all strings of that size sequentially into blocks which are large enough to hold many short strings. The sizes will depend on your precise application, but I've had some success with a model in which "short" strings are a maximum of 31 bytes and the chunk size is exactly 4096 bytes always allocated on a 4096-byte boundary (using posix_memalign, for example). The pool allocator maintains a single chunk, to which it appends newly-allocated strings. It also keeps a list of allocated chunks, each one with a count of active strings. pool_malloc(n) defers to malloc(n) if the string is longer than the cutoff; otherwise it puts the new string at the end of the currently active chunk after first allocating a new chunk if the current chunk is full. pool_free(s) checks if s is short enough to fit into a chunk. If not, it just calls free(), and if so finds the chunk in the list of active chunks and decrements its active string count. (It's easy to find the chunk because of the fact that chunks are all 4k-aligned.) There are lots of variations on this theme: you can put the metadata into the chunk itself rather than keeping external metadata; you can simply not free short strings until the entire pool is deallocated (which saves a lot of time at the expense of extra memory usage if there are lots of frees); you can make chunks of fixed-length strings which makes it easier to immediately free a string (but requires you to keep some kind of freelist structure.)

These two strategies can be used in combination; that is particularly effective if you can afford to free the entire intern table / memory pool in a single operation, which is often possible. (See, for example, the Apache Portable Runtime approach to memory allocation.)

rici
  • 234,347
  • 28
  • 237
  • 341