15

Why do C programmers often allocate strings (char arrays) in powers of two?

You often see...

char str[128]
char str[512]
char str[2048]

Less often, you see...

char str[100]
char str[500]
char str[2000]

Why is that?

I understand the answer will involve memory being addressed in binary... But why don't we often see char str[384], which is 128 + 256 (a sum of powers of two)?

Why are multiples of two not used? Why do C programmers use powers of two?

Ricardo Magalhães Cruz
  • 3,504
  • 6
  • 33
  • 57
  • 7
    It's just a common habit, there's no strong reason. – Barmar Jan 13 '18 at 00:35
  • 4
    Programmers just like powers of 2. There are rare times when it actually matters (like if you have a ring buffer and want to handle wrap by masking an index). But usually it's not really important. – Michael Burr Jan 13 '18 at 00:36
  • 2
    It does have real use sometimes: like ring buffers in embedded code. It lets you wrap the index without branching. – Lee Daniel Crocker Jan 13 '18 at 00:37
  • Well, any halfway decent programmer knows the first powers of two, and most of the limits for data types, by heart. Also, bit operations naturally use powers of 2. There are lots more multiples of those, and they are rarer, which contributes to them staying rarer. – Deduplicator Jan 13 '18 at 00:39
  • It also conveniently divides common page sizes. If your buffer's boundaries line up with the page boundaries of the underlying memory backing, it can be a performance benefit. – William Pursell Jan 13 '18 at 00:48
  • @WilliamPursell But there's no reason to expect the buffer to be aligned on a page boundary. – Barmar Jan 13 '18 at 00:49
  • 3
    Some processors (like ARM) [can operate with powers of two more easily than other constants](http://www.davespace.co.uk/arm/introduction-to-arm/immediates.html). – Raymond Chen Jan 13 '18 at 01:12
  • Most of the time, it's because programmers are obsessed with powers of 2. In a few cases, it really is specific to the application. – lurker Jan 13 '18 at 01:20
  • In many cases it's just habit. There are other common sizes you will see as well, such as `char buf[80]; // I don't know why I like 80, but it's a number that has been around a long time` and `char numstr[20]; // that must be big enough for an integer string, right?` – Michael Geary Jan 13 '18 at 07:24
  • Let's say you needed a field with a max of 970 characters... Many programmers would round it to 1024. Programmer superstition. Many others would round it to 1000--that's non-programmer superstition. For a programmer one is just as natural as the other--or as natural as picking the number you actually need, 970. There is probably a little bit of an Easter egg feeling in there as well--mixed with some programming history and a touch of habit. Superstition. – Bill K Jan 13 '18 at 08:06
  • @MichaelGeary 80 was a common line length for hardware terminals for decades. – Russell Borogove Jan 13 '18 at 14:07
  • @RussellBorogove Indeed, I learned to program in 1968 on Teletypes and punch cards, both 80 columns of course. What I find funny about the popular `char buf[80];` is that (assuming a zero terminated string) it _won't hold 80 characters!_ – Michael Geary Jan 13 '18 at 18:05

5 Answers

10

There is no good reason for it anymore except for some very rare cases.

To debunk the most common argument: It helps the memory allocator to avoid fragmentation.

Most often it will not. If you allocate - let's say - 256 bytes, the memory allocator will add some additional space for its internal management and housekeeping, so your allocation is internally larger. Do two 256-byte buffers take the same total space as one 512-byte buffer? Not true.

For performance it may even be harmful, because of how CPU caches work.

Let's say you need N buffers of some size. You may declare them this way:

char buffer[N][256];

Now buffer[0] through buffer[N-1] have identical least significant bits in their addresses, and these bits are used to select cache lines. The first bytes of your buffers all occupy the same place in your CPU cache.

If you do calculations on the first few bytes of each buffer over and over again, you won't see much acceleration from your first-level cache.

If, on the other hand, you declare them like this:

char buffer[N][300];

then the individual buffers don't have identical least significant bits in their addresses, and the first-level cache can be used fully.

Lots of people have already run into this issue, for example see this question here: Matrix multiplication: Small difference in matrix size, large difference in timings

There are a few legitimate use-cases for power-of-two buffer sizes. If you write your own memory allocator, for example, you want to manage your raw memory in chunks equal to the operating system page size. Or you may have hardware constraints that force you to use power-of-two numbers (GPU textures etc.).

ilkkachu
  • 6,221
  • 16
  • 30
Nils Pipenbrinck
  • 83,631
  • 31
  • 151
  • 221
  • The overhead argument is true for many memory managers. But some memory managers don't allocate their metadata inline (jemalloc being an example), so they should work well with power-of-two allocations. – CodesInChaos Jan 13 '18 at 08:17
  • @CodesInChaos, which is actually a very good point: to decide if the powers-of-two make sense, we need to know the internals of the allocator. For ones with inline metadata, it might be safer to allocate something like powers-of-two minus a constant, like 112 or 1008... but that's also assuming a lot. – ilkkachu Jan 13 '18 at 12:52
  • Your memory allocator argument is unfortunately inaccurate. Homework memory allocators are generally implemented as a linked list (free-list), but the current crop of allocators generally uses off-line metadata due to the inefficiency you pointed out. Furthermore, memory allocators only come into play when allocating memory dynamically; in the cases above, the variables are either on the stack or global. – Matthieu M. Jan 13 '18 at 12:55
  • @MatthieuM. really? Just did two 256 allocations on a vanilla Linux Ubuntu box. The pointers are 272 bytes apart. Version of glib is (Ubuntu GLIBC 2.23-0ubuntu9) 2.23 – Nils Pipenbrinck Jan 13 '18 at 14:19
  • @NilsPipenbrinck: *cringe*. That's unfortunate. See jemalloc for a modern (~2011?) allocator design; from [this article](https://www.facebook.com/notes/facebook-engineering/scalable-memory-allocation-using-jemalloc/480222803919/): *Small objects are grouped together, with additional metadata at the start of each page run, whereas large objects are independent of each other, and their metadata reside entirely in the arena chunk header.* Keeping metadata apart really helps with satisfying larger-than-`void*` alignments (such as for SIMD) and keeps metadata packed in cache. – Matthieu M. Jan 13 '18 at 14:30
7

An interesting question. Blocks of size 2^k fit better when the OS memory manager uses the buddy memory allocation technique. This technique deals with fragmentation of allocations. https://en.wikipedia.org/wiki/Buddy_memory_allocation

This allocation system aligns blocks to power-of-two sizes. But it only applies to heap allocation.

int * array = (int*) malloc(sizeof(int)*512); // OS manages heap memory allocation

When a buffer is allocated on the stack, there is no need for block alignment.

int buffer[512]; // stack allocation

So I think there is no reason to make sizes powers of 2 there.

Tomáš Dejmek
  • 151
  • 1
  • 4
4

This is to minimize the number of tiny blocks of memory that are too small to use for anything but need to be walked when the program allocates or deallocates memory. A classic explanation from Joel Spolsky’s blog, all the way back in 2001:

Smart programmers minimize the potential disruption of malloc by always allocating blocks of memory that are powers of 2 in size. You know, 4 bytes, 8 bytes, 16 bytes, 18446744073709551616 bytes, etc. For reasons that should be intuitive to anyone who plays with Lego, this minimizes the amount of weird fragmentation that goes on in the free chain. Although it may seem like this wastes space, it is also easy to see how it never wastes more than 50% of the space. So your program uses no more than twice as much memory as it needs to, which is not that big a deal.

There were plenty of other discussions of memory-heap implementations before then, including by Donald Knuth in The Art of Computer Programming. Not everyone will necessarily agree with that advice, but that is why people do it.

Davislor
  • 14,674
  • 2
  • 34
  • 49
  • Let's think carefully here... Say that the number of characters I can place in a block is **n** which is a power of 2 (computers being binary and all). Then you should use a *multiple* of **n**. If **n=16**, then you should use 16, 32, 64 for your string. It does not follow that you should use a *power* of 2. (Unless you don't know the true value of **n**, in which case you could use a *power* of 2 just to cover your bases.) – Ricardo Magalhães Cruz Jan 13 '18 at 15:59
  • @RicardoCruz One advantage of doing it with powers of 2 is that you'll need at most ⌈ lg *N* ⌉ resizes to fit a string of length *N*, rather than O(*N*). Another's that you won't fill a block of size 3· *b* with an allocation of size 2· *b* and leave a tiny hole no one else can use; any hole will be another power of 2. I'm not really interested in arguing over this approach, but those are some of the reasons people do it. – Davislor Jan 13 '18 at 16:15
1

The system itself uses powers of 2 to set various limits. For example, the maximum length of a file name may be 256 or 32768, disk page sizes are powers of 2, etc.

We often have to keep these system restrictions in mind, and use the same powers of 2.

But if you only need 257 bytes, don't over-allocate 512 bytes. Some programmers use powers of 2 to set limits for user input. This can be confusing to the user. It had some benefits on older computers, but not now.

Other times we use allocations which are arbitrarily large. For example, we might use 1000 or 1024 to read one line of text, because we don't know how long the input is. This is bad programming either way; it really doesn't matter whether the allocation is 1000 or 1024 in that case.

Barmak Shemirani
  • 30,904
  • 6
  • 40
  • 77
0

I doubt there is much reason to do this on desktop-class computers any more. For embedded devices, where there are more extreme memory and performance limitations, powers of two can allow some extra optimisations.

Often operations such as multiplication are expensive on these devices, so replacing a multiplication with a bit shift is an easy way to gain extra performance. Bounds checking can also be skipped in some cases, such as when an 8-bit index is used on an array of size 256.

user2248702
  • 2,741
  • 7
  • 41
  • 69