28

Is there a way to align a pointer in C? Suppose I'm writing data to an array stack (so the pointer goes downward) and I want the next data I write to be 4-aligned so the data is written at a memory location which is a multiple of 4, how would I do that?

I have

 uint8_t ary[1024];
 ary = ary+1024;
 ary -= /* ... */

Now suppose that ary points at location 0x05. I want it to point to 0x04. Now I could just do

ary -= (ary % 4);

but C doesn't allow modulo on pointers. Is there any solution that is architecture independent?

templatetypedef
Mark
  • @templatetypedef: I'd be interested to see the reference in the C++ standard where it says that `long` can hold a pointer. I believe that your belief is mistaken, but I'm willing to be proved wrong. – Jonathan Leffler Jan 30 '11 at 01:12
  • @Jonathan Leffler- Looks like you're right and that pointers needn't fit into a long! I've been operating under this assumption for the longest time... I wonder why I first thought that? – templatetypedef Jan 30 '11 at 01:23
  • @templatetypedef: because on most systems, you can get away with that assumption, even though the standard(s) do not guarantee it. Both ILP32 and LP64 (and ILP64 systems, if you can still find one - DEC Alpha was in that category) work OK. The only prevalent system where that does not hold is Windows 64 - an LLP64 system. – Jonathan Leffler Jan 30 '11 at 01:26
  • @JonathanLeffler It *was* required (by implication) by C89. Microsoft forced through a change in C99 to make it not required, over basically everyone else's objections, and then didn't implement C99. Yes, I'm still bitter. – zwol Oct 25 '12 at 23:50

6 Answers

55

Arrays are NOT pointers, despite anything you may have read in misguided answers here (meaning this question in particular or Stack Overflow in general — or anywhere else).

You cannot alter the value represented by the name of an array as shown.

What is confusing, perhaps, is that if ary is a function parameter, it will appear that you can adjust the array:

void function(uint8_t ary[1024])
{
    ary += 213; // No problem because ary is a uint8_t pointer, not an array
    ...
}

Arrays as parameters to functions are different from arrays defined either outside a function or inside a function.

You can do:

uint8_t    ary[1024];
uint8_t   *stack = ary + 510;
uintptr_t  addr  = (uintptr_t)stack;

if (addr % 8 != 0)
    addr += 8 - addr % 8;
stack = (uint8_t *)addr;

This ensures that the value in stack is aligned on an 8-byte boundary, rounded up. Your question asks for rounding down to a 4-byte boundary, so the code changes to:

if (addr % 4 != 0)
    addr -= addr % 4;
stack = (uint8_t *)addr;
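Applied to the downward-growing stack from the question, the rounding-down step might be wrapped up as in the sketch below. The name push_aligned and its bounds checks are assumptions made for illustration, not part of the original code, and the conversion between pointers and uintptr_t is implementation-defined (it behaves as expected on common flat-address platforms).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
** Hypothetical helper: reserve size bytes below sp, then round the
** resulting address down to a multiple of align (a power of two).
** Returns NULL if the result would fall below base.
*/
static uint8_t *push_aligned(uint8_t *base, uint8_t *sp, size_t size, uintptr_t align)
{
    assert(align > 0 && (align & (align - 1)) == 0); /* Power of 2 */
    assert(base <= sp);

    if ((size_t)(sp - base) < size)
        return NULL;                        /* Not enough room for the data */
    uint8_t *p = sp - size;
    size_t drop = (size_t)((uintptr_t)p % align);  /* Bytes above the boundary */
    if ((size_t)(p - base) < drop)
        return NULL;                        /* Aligning would leave the array */
    return p - drop;                        /* Stay derived from the original pointer */
}
```

Subtracting the remainder from the original pointer, rather than converting the adjusted integer back to a pointer, keeps the result visibly inside the same array.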

Yes, you can do that with bit masks too. Either:

addr = (addr + (8 - 1)) & -8;  // Round up to 8-byte boundary

or:

addr &= -4;                    // Round down to a 4-byte boundary

This only works correctly if the alignment is a power of two, not for arbitrary values. The code with modulus operations will work correctly for any (positive) modulus.
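To make the power-of-two restriction concrete, here is a small sketch that puts the two rounding-down variants side by side. The function names are hypothetical and chosen only for this illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: true when n is a nonzero power of two. */
static bool is_power_of_two(uintptr_t n)
{
    return n != 0 && (n & (n - 1)) == 0;
}

/* Round down with modulus: valid for any positive alignment. */
static uintptr_t round_down_mod(uintptr_t addr, uintptr_t align)
{
    assert(align > 0);
    return addr - addr % align;
}

/* Round down with a mask: valid only for power-of-two alignments. */
static uintptr_t round_down_mask(uintptr_t addr, uintptr_t align)
{
    assert(is_power_of_two(align));
    return addr & ~(align - 1);     /* Same bits as addr & -align */
}
```

For address 29 and alignment 4, both variants give 28. For alignment 12, the modulus version gives 24, whereas applying the mask without the assertion would compute 29 & ~11, which is 20 and not a multiple of 12.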

See also: How to allocate aligned memory using only the standard library.


Demo code

Gnzlbg commented:

The code for a power of two breaks if I try to align e.g. uintptr_t(2) up to a 1 byte boundary (both are powers of 2: 2^1 and 2^0). The result is 1 but should be 2 since 2 is already aligned to a 1 byte boundary.

This code demonstrates that the alignment code is OK — as long as you interpret the comments just above correctly (now clarified by the 'either or' words separating the bit masking operations; I got caught when first checking the code).

The alignment functions could be written more compactly, especially without the assertions, but the compiler will optimize to produce the same code from what is written and what could be written. Some of the assertions could be made more stringent, too. And maybe the test function should print out the base address of the stack before doing anything else.

The code could, and maybe should, check that there won't be numeric overflow or underflow with the arithmetic. This would be more likely a problem if you aligned addresses to a multi-megabyte boundary; while you keep to alignments under 1 KiB, you're unlikely to find a problem if you're not attempting to go out of bounds of the arrays you have access to. (Strictly, even if you do multi-megabyte alignments, you won't run into trouble if the result will be within the range of memory allocated to the array you're manipulating.)
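Separately from the demo program that follows, one way such an overflow check might look for the round-up case is sketched here. The function align_up_checked is a hypothetical helper, not part of the demo code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
** Hypothetical helper: round addr up to a power-of-two boundary,
** reporting failure instead of silently wrapping around.
*/
static bool align_up_checked(uintptr_t addr, uintptr_t align, uintptr_t *out)
{
    assert(align > 0 && (align & (align - 1)) == 0); /* Power of 2 */
    assert(out != 0);

    if (addr > UINTPTR_MAX - (align - 1))
        return false;               /* addr + (align - 1) would wrap */
    *out = (addr + (align - 1)) & ~(align - 1);
    return true;
}
```

Rounding down needs no such check, because clearing low bits can never increase the value.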

#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/*
** Because the test code works with pointers to functions, the inline
** function qualifier is moot.  In 'real' code using the functions, the
** inline might be useful.
*/

/* Align upwards - arithmetic mode (hence _a) */
static inline uint8_t *align_upwards_a(uint8_t *stack, uintptr_t align)
{
    assert(align > 0 && (align & (align - 1)) == 0); /* Power of 2 */
    assert(stack != 0);

    uintptr_t addr  = (uintptr_t)stack;
    if (addr % align != 0)
        addr += align - addr % align;
    assert(addr >= (uintptr_t)stack);
    return (uint8_t *)addr;
}

/* Align upwards - bit mask mode (hence _b) */
static inline uint8_t *align_upwards_b(uint8_t *stack, uintptr_t align)
{
    assert(align > 0 && (align & (align - 1)) == 0); /* Power of 2 */
    assert(stack != 0);

    uintptr_t addr  = (uintptr_t)stack;
    addr = (addr + (align - 1)) & -align;   // Round up to align-byte boundary
    assert(addr >= (uintptr_t)stack);
    return (uint8_t *)addr;
}

/* Align downwards - arithmetic mode (hence _a) */
static inline uint8_t *align_downwards_a(uint8_t *stack, uintptr_t align)
{
    assert(align > 0 && (align & (align - 1)) == 0); /* Power of 2 */
    assert(stack != 0);

    uintptr_t addr  = (uintptr_t)stack;
    addr -= addr % align;
    assert(addr <= (uintptr_t)stack);
    return (uint8_t *)addr;
}

/* Align downwards - bit mask mode (hence _b) */
static inline uint8_t *align_downwards_b(uint8_t *stack, uintptr_t align)
{
    assert(align > 0 && (align & (align - 1)) == 0); /* Power of 2 */
    assert(stack != 0);

    uintptr_t addr  = (uintptr_t)stack;
    addr &= -align;                         // Round down to align-byte boundary
    assert(addr <= (uintptr_t)stack);
    return (uint8_t *)addr;
}

static inline int inc_mod(int x, int n)
{
    assert(x >= 0 && x < n);
    if (++x >= n)
        x = 0;
    return x;
}

typedef uint8_t *(*Aligner)(uint8_t *addr, uintptr_t align);

static void test_aligners(const char *tag, Aligner align_a, Aligner align_b)
{
    const int align[] = { 64, 32, 16, 8, 4, 2, 1 };
    enum { NUM_ALIGN = sizeof(align) / sizeof(align[0]) };
    uint8_t stack[1024];
    uint8_t *sp = stack + sizeof(stack);
    int dec = 1;
    int a_idx = 0;

    printf("%s\n", tag);
    while (sp > stack)
    {
        sp -= dec++;
        uint8_t *sp_a = (*align_a)(sp, align[a_idx]);
        uint8_t *sp_b = (*align_b)(sp, align[a_idx]);
        printf("old %p, adj %.2d, A %p, B %p\n",
               (void *)sp, align[a_idx], (void *)sp_a, (void *)sp_b);
        assert(sp_a == sp_b);
        sp = sp_a;
        a_idx = inc_mod(a_idx, NUM_ALIGN);
    }
    putchar('\n');
}

int main(void)
{
    test_aligners("Align upwards", align_upwards_a, align_upwards_b);
    test_aligners("Align downwards", align_downwards_a, align_downwards_b);
    return 0;
}

Sample output (partially truncated):

Align upwards
old 0x7fff5ebcf4af, adj 64, A 0x7fff5ebcf4c0, B 0x7fff5ebcf4c0
old 0x7fff5ebcf4be, adj 32, A 0x7fff5ebcf4c0, B 0x7fff5ebcf4c0
old 0x7fff5ebcf4bd, adj 16, A 0x7fff5ebcf4c0, B 0x7fff5ebcf4c0
old 0x7fff5ebcf4bc, adj 08, A 0x7fff5ebcf4c0, B 0x7fff5ebcf4c0
old 0x7fff5ebcf4bb, adj 04, A 0x7fff5ebcf4bc, B 0x7fff5ebcf4bc
old 0x7fff5ebcf4b6, adj 02, A 0x7fff5ebcf4b6, B 0x7fff5ebcf4b6
old 0x7fff5ebcf4af, adj 01, A 0x7fff5ebcf4af, B 0x7fff5ebcf4af
old 0x7fff5ebcf4a7, adj 64, A 0x7fff5ebcf4c0, B 0x7fff5ebcf4c0
old 0x7fff5ebcf4b7, adj 32, A 0x7fff5ebcf4c0, B 0x7fff5ebcf4c0
old 0x7fff5ebcf4b6, adj 16, A 0x7fff5ebcf4c0, B 0x7fff5ebcf4c0
old 0x7fff5ebcf4b5, adj 08, A 0x7fff5ebcf4b8, B 0x7fff5ebcf4b8
old 0x7fff5ebcf4ac, adj 04, A 0x7fff5ebcf4ac, B 0x7fff5ebcf4ac
old 0x7fff5ebcf49f, adj 02, A 0x7fff5ebcf4a0, B 0x7fff5ebcf4a0
old 0x7fff5ebcf492, adj 01, A 0x7fff5ebcf492, B 0x7fff5ebcf492
…
old 0x7fff5ebcf0fb, adj 08, A 0x7fff5ebcf100, B 0x7fff5ebcf100
old 0x7fff5ebcf0ca, adj 04, A 0x7fff5ebcf0cc, B 0x7fff5ebcf0cc
old 0x7fff5ebcf095, adj 02, A 0x7fff5ebcf096, B 0x7fff5ebcf096

Align downwards
old 0x7fff5ebcf4af, adj 64, A 0x7fff5ebcf480, B 0x7fff5ebcf480
old 0x7fff5ebcf47e, adj 32, A 0x7fff5ebcf460, B 0x7fff5ebcf460
old 0x7fff5ebcf45d, adj 16, A 0x7fff5ebcf450, B 0x7fff5ebcf450
old 0x7fff5ebcf44c, adj 08, A 0x7fff5ebcf448, B 0x7fff5ebcf448
old 0x7fff5ebcf443, adj 04, A 0x7fff5ebcf440, B 0x7fff5ebcf440
old 0x7fff5ebcf43a, adj 02, A 0x7fff5ebcf43a, B 0x7fff5ebcf43a
old 0x7fff5ebcf433, adj 01, A 0x7fff5ebcf433, B 0x7fff5ebcf433
old 0x7fff5ebcf42b, adj 64, A 0x7fff5ebcf400, B 0x7fff5ebcf400
old 0x7fff5ebcf3f7, adj 32, A 0x7fff5ebcf3e0, B 0x7fff5ebcf3e0
old 0x7fff5ebcf3d6, adj 16, A 0x7fff5ebcf3d0, B 0x7fff5ebcf3d0
old 0x7fff5ebcf3c5, adj 08, A 0x7fff5ebcf3c0, B 0x7fff5ebcf3c0
old 0x7fff5ebcf3b4, adj 04, A 0x7fff5ebcf3b4, B 0x7fff5ebcf3b4
old 0x7fff5ebcf3a7, adj 02, A 0x7fff5ebcf3a6, B 0x7fff5ebcf3a6
old 0x7fff5ebcf398, adj 01, A 0x7fff5ebcf398, B 0x7fff5ebcf398
…
old 0x7fff5ebcf0f7, adj 01, A 0x7fff5ebcf0f7, B 0x7fff5ebcf0f7
old 0x7fff5ebcf0d3, adj 64, A 0x7fff5ebcf0c0, B 0x7fff5ebcf0c0
old 0x7fff5ebcf09b, adj 32, A 0x7fff5ebcf080, B 0x7fff5ebcf080
Jonathan Leffler
  • Doesn't this code break down when you want to align to something other than a power of 2? But I don't know if you would ever want to do that :D – tom Mar 05 '14 at 17:56
  • @tom: yes, this code assumes that you'd want to align to a power of 2 (so it does break if you needed something else). I've never heard of a system requiring anything else (for example, a 6-byte alignment becomes equivalent to a 2-byte alignment when all is said and done). – Jonathan Leffler Mar 05 '14 at 18:14
  • @JonathanLeffler the code for a power of two breaks if I try to align e.g. `uintptr_t(2)` up to a 1 byte boundary (both are powers of 2: 2^1 and 2^0). The result is 1 but should be 2 since 2 is already aligned to a 1 byte boundary. – gnzlbg Dec 23 '15 at 11:55
  • @gnzlbg: There's no point in aligning to a 1-byte boundary (or a 0-byte boundary, if such a thing can be said to make any sense, which I don't think it can). On modern machines with byte addresses (as opposed to older machines which sometimes had word addresses and required extra chicanery to handle bytes), there's no address that is not already aligned on a 1-byte boundary, so there's no point in computing one. However, regardless of the need for it, the code shown works for powers of 2 from 1 .. 64 (see demo code), and should be OK for larger alignments, subject to no overflow (not checked). – Jonathan Leffler Dec 23 '15 at 15:56
  • Maybe I got caught with the same "either or" issue as @JonathanLeffler . What I ended up doing in case somebody finds it useful is to `auto align_up(Integer x, size_t a) { return x + (a - 1) & ~(a - 1); }` and `auto align_down(Integer x, size_t a) { return x & ~(a - 1); }` which work for non-power-of-2 x and power-of-2 a. – gnzlbg Dec 23 '15 at 16:15
  • @JonathanLeffler shouldn't `int arr[]` in function parameters be changed to `int * const arr` instead of just `int *arr` ? – Ajay Brahmakshatriya Apr 13 '17 at 06:34
  • @AjayBrahmakshatriya: There's no particular reason why you need to convert `int arr[]` to `const int *arr` or `int *const arr` in a function. You can do it if you wish, as long as you know what the `const` qualifies (hint: it isn't the same thing in the two alternatives I give). It depends on what you're after. In general, there's no point in making the pointer in the function constant (it won't affect the calling code whether it is changed or not), and if the array was treated as non-const, there's no reason to make the data that the pointer points at const either. – Jonathan Leffler Apr 13 '17 at 06:38
  • It would be wrong to convert it to `const int *arr` since the array isn't declared as `const int arr[]` . But `int * const arr` would make sense since it was declared as an array and not a pointer. – Ajay Brahmakshatriya Apr 13 '17 at 06:41
  • You can do it if you like — but there's no need. – Jonathan Leffler Apr 13 '17 at 06:41
  • For rounding up to the next block size of 8 boundary, what if we did `addr = (addr + 8 - 1) / 8 * 8` instead of `addr += 8 - addr % 8`? – user5965026 Aug 24 '21 at 14:52
3

DO NOT USE MODULO!!! IT IS REALLY SLOW!!! Hands down the fastest way to align a pointer is to use 2's complement math. You need to invert the bits, add one, and keep only the 2 (for 32-bit) or 3 (for 64-bit) least significant bits. The result is an offset that you then add to the pointer value to align it. Works great for 32 and 64-bit numbers. For 16-bit alignment just mask the pointer with 0x1 and add that value. The algorithm works identically in any language but as you can see, Embedded C++ is vastly superior to C in every way, shape and form.

#include <cstdint>
/** Returns the number to add to align the given pointer to a 8, 16, 32, or 64-bit 
    boundary.
    @author Cale McCollough.
    @param  ptr The address to align.
    @return The offset to add to the ptr to align it. */
template<typename T>
inline uintptr_t MemoryAlignOffset (const void* ptr) {
    return ((~reinterpret_cast<uintptr_t> (ptr)) + 1) & (sizeof (T) - 1);
}

/** Word aligns the given byte pointer up in addresses.
    @author Cale McCollough.
    @param ptr Pointer to align.
    @return Next word aligned up pointer. */
template<typename T>
inline T* MemoryAlign (T* ptr) {
    uintptr_t offset = MemoryAlignOffset<uintptr_t> (ptr);
    char* aligned_ptr = reinterpret_cast<char*> (ptr) + offset;
    return reinterpret_cast<T*> (aligned_ptr);
}
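The same two's complement offset can be written in plain C, which is the language the question asks about. The sketch below uses hypothetical function names and pairs the bitwise form with a modulus form so the two can be compared; it makes no claim about relative speed.

```c
#include <assert.h>
#include <stdint.h>

/*
** Bytes to add to addr to reach the next align-byte boundary
** (0 if addr is already aligned).  align must be a power of two.
** Unsigned arithmetic wraps, so ~addr + 1 is the two's complement
** negation of addr.
*/
static uintptr_t align_offset(uintptr_t addr, uintptr_t align)
{
    assert(align > 0 && (align & (align - 1)) == 0); /* Power of 2 */
    return (~addr + 1) & (align - 1);
}

/* The same quantity computed with modulus, for comparison. */
static uintptr_t align_offset_mod(uintptr_t addr, uintptr_t align)
{
    assert(align > 0);
    return (align - addr % align) % align;
}
```

For example, an address of 13 with an 8-byte alignment yields an offset of 3 from both functions, and an address of 16 yields 0.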

For a detailed write-up and proofs, please see https://github.com/kabuki-starship/kabuki-toolkit/wiki/Fastest-Method-to-Align-Pointers. If you would like to see proof of why you should never use modulo, I invented the world's fastest integer-to-string algorithm. The benchmark in the paper shows you the effect of optimizing away just one modulo instruction. Please see https://github.com/kabuki-starship/kabuki-toolkit/wiki/Engineering-a-Faster-Integer-to-String-Algorithm.

Graph of why you shouldn't use modulo

  • Compilers will optimize modulo operations into bitwise operations if the operands are unsigned integers and the modulus is a power of 2: https://gcc.godbolt.org/z/6tVTfN – kirbyfan64sos Nov 22 '19 at 22:16
2

For some reason I can't use modulo or bitwise operations. In this case:

void *alignAddress = (void*)((((intptr_t)address + align - 1) / align) * align) ;

For C++:

template <int align, typename T>
constexpr T padding(T value)
{
    return ((value + align - 1) / align) * align;
}
...
char* alignAddress = reinterpret_cast<char*>(padding<8>(reinterpret_cast<uintptr_t>(address)));
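A side effect of the division approach is that it does not depend on the alignment being a power of two. A minimal C sketch, using the hypothetical name round_up_div and assuming the addition does not wrap:

```c
#include <assert.h>
#include <stdint.h>

/*
** Hypothetical helper: round addr up to a multiple of align using only
** addition, division and multiplication.  Works for any positive align;
** assumes addr + align - 1 does not exceed UINTPTR_MAX.
*/
static uintptr_t round_up_div(uintptr_t addr, uintptr_t align)
{
    assert(align > 0);
    return (addr + align - 1) / align * align;
}
```

With an alignment of 12, an address of 13 rounds up to 24, and 24 stays at 24.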
mr NAE
1

I'm editing this answer because:

  1. I had a bug in my original code (I forgot a typecast to intptr_t), and
  2. I'm replying to Jonathan Leffler's criticism in order to clarify my intent.

The code below is not meant to imply you can change the value of an array (foo). But you can get an aligned pointer into that array, and this example illustrates one way to do it.

#define         alignmentBytes              ( 1 << 2 )   // == 4, but enforces the idea that alignmentBytes should be a power of two
#define         alignmentBytesMinusOne      ( alignmentBytes - 1 )

uint8_t         foo[ 1024 + alignmentBytesMinusOne ];
uint8_t         *fooAligned;

fooAligned = (uint8_t *)((intptr_t)( foo + alignmentBytesMinusOne ) & ~alignmentBytesMinusOne);
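The over-allocate-then-align idea can also be sketched as a self-contained unit. The names storage, ALIGNMENT, PAYLOAD and aligned_start are assumptions for this illustration; the sketch adds an offset to the array address instead of casting a masked integer back to a pointer.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum { ALIGNMENT = 4, PAYLOAD = 1024 };     /* Illustrative values */

/* ALIGNMENT - 1 spare bytes guarantee PAYLOAD aligned bytes fit. */
static uint8_t storage[PAYLOAD + ALIGNMENT - 1];

/* Return the first ALIGNMENT-aligned address inside storage. */
static uint8_t *aligned_start(void)
{
    uintptr_t addr = (uintptr_t)storage;
    size_t skip = (size_t)((~addr + 1) & (ALIGNMENT - 1)); /* 0..ALIGNMENT-1 */
    assert(skip < ALIGNMENT);
    return storage + skip;
}
```

Because at most ALIGNMENT - 1 bytes are skipped, the PAYLOAD bytes starting at the returned pointer always lie within storage.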
par
1

Based on tricks learned elsewhere, and one from reading @par's answer, apparently all I needed for my special case (a 32-bit-like machine) is ((size - 1) | 3) + 1. It behaves as shown below, and I thought it might be useful for others:

for (size_t size = 0; size < 20; ++size) printf("%zu\n", ((size - 1) | 3) + 1);

0
4
4
4
4
8
8
8
8
12
12
12
12
16
16
16
16
20
20
20
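The constant 3 in that expression is the alignment minus one, so the trick generalizes to other power-of-two alignments. A minimal sketch, with round_up_or as a hypothetical name:

```c
#include <assert.h>
#include <stddef.h>

/*
** Hypothetical generalization of ((size - 1) | 3) + 1: round size up
** to a multiple of align, where align is a power of two.  For size 0
** the subtraction wraps to SIZE_MAX and the final + 1 wraps back to 0.
*/
static size_t round_up_or(size_t size, size_t align)
{
    assert(align > 0 && (align & (align - 1)) == 0); /* Power of 2 */
    return ((size - 1) | (align - 1)) + 1;
}
```

Calling round_up_or(size, 4) reproduces the sequence printed above, and round_up_or(9, 8) gives 16.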
Ebrahim Byagowi
1

I use this to align pointers in C:

#include <inttypes.h>
static inline void * please_align(void * ptr){
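    /* Note: this GCC attribute aligns the variable res itself, not the address it holds. */
    char * res __attribute__((aligned(128))) ;
    /* Round up to the next multiple of 128 (adds 0 if ptr is already aligned). */
    res = (char *)ptr + (128 - (uintptr_t) ptr) % 128;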
    return res ;
}
Michel