
CPU and Data alignment

In this question, Yann Ramin said that some CPUs (ARM, or Intel with SSE instructions) require aligned memory and either have undefined behavior or throw an exception on unaligned accesses. If I used a memory pool that doesn't handle this situation, would it make my application worse?

JoeyMiu

1 Answer


Using unaligned memory is almost always a bad idea, not only when it is explicitly disallowed. Even when it doesn't cause an error, an unaligned access is slower than an aligned one if it crosses a cache line boundary, and in other cases too (see Tony D's comment). Just use a memory pool that returns aligned addresses.

You can make a simple aligned memory pool by allocating a large chunk of bytes with `new`, breaking it into a given number of 4-, 8-, 16-, ... byte blocks, and returning the smallest block that fits the request. You could use a bitmap to keep track of which blocks are allocated. I should say this is inefficient. I assume this is more for fun/learning than production, or you would just use `new`. Coding your own allocator isn't easy; look at implementations of malloc to see what I mean. It's a difficult balance between speed, space efficiency and fragmentation.
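As a rough illustration of the idea above, here is a minimal sketch of one size class of such a pool. The names `FixedBlockPool`, `allocate` and `deallocate` are hypothetical, not from any library. It assumes the block size is a power of two, uses `std::vector<bool>` as the bitmap, and uses `std::align` to find an aligned start inside the chunk instead of assuming the chunk itself is aligned.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <memory>
#include <vector>

// Hypothetical fixed-size pool: every block is block_size bytes and
// block_size-aligned. block_size is assumed to be a power of two.
class FixedBlockPool {
public:
    FixedBlockPool(std::size_t block_size, std::size_t block_count)
        : block_size_(block_size),
          raw_(block_size * block_count + block_size),  // slack for alignment
          used_(block_count, false),
          base_(nullptr) {
        assert(block_size != 0 && (block_size & (block_size - 1)) == 0);
        void* p = raw_.data();
        std::size_t space = raw_.size();
        // std::align moves p forward to the first block_size-aligned address.
        base_ = static_cast<unsigned char*>(
            std::align(block_size, block_size * block_count, p, space));
        assert(base_ != nullptr);
    }

    // base_ points into raw_, so copying would leave a dangling pointer.
    FixedBlockPool(const FixedBlockPool&) = delete;
    FixedBlockPool& operator=(const FixedBlockPool&) = delete;

    // First-fit scan of the bitmap; returns nullptr when the pool is full.
    void* allocate() {
        for (std::size_t i = 0; i < used_.size(); ++i) {
            if (!used_[i]) {
                used_[i] = true;
                return base_ + i * block_size_;
            }
        }
        return nullptr;
    }

    // Marks the block that p points to as free again.
    void deallocate(void* p) {
        std::size_t offset =
            static_cast<std::size_t>(static_cast<unsigned char*>(p) - base_);
        assert(offset % block_size_ == 0 && offset / block_size_ < used_.size());
        used_[offset / block_size_] = false;
    }

private:
    std::size_t block_size_;
    std::vector<unsigned char> raw_;  // the large chunk of bytes
    std::vector<bool> used_;          // one bit per block
    unsigned char* base_;             // aligned start of the blocks
};
```

A pool with several size classes could hold one such object per block size (4, 8, 16, ...) and forward each request to the smallest class that fits. The linear bitmap scan is the inefficiency mentioned above.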

Tyler
  • +1 for lots of good points, but regarding "when it doesn't cause an error, if a memory access crosses a cache line" - it's not just cache line boundaries... even unaligned accesses that are inside a cache line can be a fraction of the usual speed while they orchestrate an operation across words. For example, reading a 32-bit `int` from addresses 1, 2, 3, 5, 6, 7 etc. – Tony Delroy Jul 24 '14 at 06:19
  • "You can make a simple aligned memory pool by allocating a large chunk of bytes with new and breaking it into a given number of 4, 8, 16, ... byte blocks and returning the smallest fit." If the CPU needs memory aligned to 8-byte boundaries, would it be unwise to return a memory address that doesn't match? – JoeyMiu Jul 24 '14 at 06:29
  • When you allocate a block from `new` it is already aligned and when you break it up like I describe, alignment is preserved. So if you return an 8 byte block, it will be 8 byte aligned. – Tyler Jul 24 '14 at 06:43
  • The last comment is only half true. When you allocate a block with `new`, it is aligned _to the allocation type's alignment_, which, in the case of `char`, is 1 byte. Of course in practice it's usually aligned to 8 bytes anyway (16 on some systems), but that is merely an implementation detail of the allocator. It is not something you may rely on. – Damon Jul 24 '14 at 10:16
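To make the point about alignment guarantees concrete, here is a small sketch, assuming C++11 or later. The helpers `is_aligned` and `align_up` and the type `Block16` are hypothetical names chosen for illustration. The idea is to state the alignment requirement explicitly with `alignas` and to check addresses, rather than trusting what an allocator happens to return for a `char` buffer.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical helper: true if p is a multiple of `alignment`,
// which is assumed to be a power of two.
inline bool is_aligned(const void* p, std::size_t alignment) {
    return (reinterpret_cast<std::uintptr_t>(p) & (alignment - 1)) == 0;
}

// Hypothetical helper: round n up to the next multiple of `alignment`,
// which is assumed to be a power of two.
inline std::size_t align_up(std::size_t n, std::size_t alignment) {
    return (n + alignment - 1) & ~(alignment - 1);
}

// alignas states the requirement in the type itself, so objects of this
// type with automatic or static storage are 16-byte aligned by rule.
struct alignas(16) Block16 {
    unsigned char bytes[16];
};

// The language only promises 1-byte alignment for char.
static_assert(alignof(char) == 1, "char has alignment 1");
static_assert(alignof(Block16) == 16, "Block16 is 16-byte aligned");
```

With these, a pool can round every block offset up with `align_up` and verify its results with `is_aligned` in debug builds.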