Rounding up to next power of 2

Question

I want to write a function that returns the nearest next power of 2 number. For example if my input is 789, the output should be 1024. Is there any way of achieving this without using any loops but just using some bitwise operators?

Related: Algorithm for finding the smallest power of two that's greater or equal to a given value is a C++ question. C++20 introduced std:bit_ceil which lets the compiler do whatever's optimal for the target system, but nothing equivalent is yet available in portable ISO C for bit-scan, popcount or other common bit operations that most CPUs have. Portable C code has to be less efficient and/or more complicated.

Given an integer, how do I find the next largest power of two using bit-twiddling? is a language-agnostic version of the question with some C++11 and 17 constexpr using GNU extensions.

Answers to this question don't need to be portable; fast versions for various platforms are useful.

They're multiple questions matching this one. For example: http://stackoverflow.com/questions/364985/algorithm-for-finding-the-smallest-power-of-two-thats-greater-or-equal-to-a-giv — Yann Droneaud, Mar 11 '13 at 11:49
By way of clarification, do you need the nearest power of 2 (ie. 65 would give you 64, but 100 would give you 128) or the nearest above (ie. 65 would give you 128, and so would 100)? — Kim Reece, Jan 21 '09 at 17:45
See here for possible solutions: [http://graphics.stanford.edu/~seander/bithacks.html#RoundUpPowerOf2Float](http://graphics.stanford.edu/~seander/bithacks.html#RoundUpPowerOf2Float) — Stefan, Jan 21 '09 at 17:31
possible duplicate of [Given an integer, how do I find the next largest power of two using bit-twiddling?](http://stackoverflow.com/questions/1322510/given-an-integer-how-do-i-find-the-next-largest-power-of-two-using-bit-twiddlin) — Nathan, Jan 16 '14 at 00:44
Or convert to Rust and use https://doc.rust-lang.org/stable/std/primitive.usize.html#method.next_power_of_two ;-) (Supposed to be in Hackers Delight section 3.2 too) — JGFMK, Aug 31 '20 at 18:49
@Nathan's linked thread is indeed posted later than the one here, but John Feminella's [answer](https://stackoverflow.com/a/1322548/3873510) in that thread is superb. Readers may want to take a look. — Paul Razvan Berg, Mar 25 '21 at 19:43
Heh. All the answers and comments here (and all the answers and comments at https://stackoverflow.com/questions/1322510/given-an-integer-how-do-i-find-the-next-largest-power-of-two-using-bit-twiddlin ) make the potentially wrong assumption that the integer is unsigned and provide example code that is very broken for negative integers. — Brendan, Jul 05 '21 at 01:00

score 211 · Accepted Answer · edited Feb 28 '23 at 04:47

211

Check the Bit Twiddling Hacks. You need to get the base 2 logarithm, then add 1 to that. Example for a 32-bit value:

Round up to the next highest power of 2

unsigned int v; // compute the next highest power of 2 of 32-bit v

v--;
v |= v >> 1;
v |= v >> 2;
v |= v >> 4;
v |= v >> 8;
v |= v >> 16;
v++;

The extension to other widths should be obvious.

An answer on Given an integer, how do I find the next largest power of two using bit-twiddling? presents some explanation of how it works, and examples of the bit-patterns for a couple inputs.

edited Feb 28 '23 at 04:47

Peter Cordes

328,167
45
605
847

answered Jan 21 '09 at 17:30

florin

13,986
6
46
47

26

This is not the most efficient solution because many processors have special instruction for counting leading zeros which can be used to compute log2 very efficiently. See https://en.wikipedia.org/wiki/Find_first_set – Simon Oct 04 '13 at 21:57
15

@Simon: it's the portable solution. There's no common efficient algorithm for all architectures – phuclv Dec 06 '13 at 09:27
9

What if the number itself is a power of two? – Litherum Dec 19 '13 at 19:34
@Litherum did you read the codes at bit twidding hacks? The power-of-2 case has already treated specifically – phuclv Dec 29 '13 at 01:42
1

@rishta the `^` operator in C isn't power. And that's the slowest solution – phuclv Jun 23 '14 at 04:04
1

Internet Archive to the rescue: https://web.archive.org/web/20160703165415/https://graphics.stanford.edu/~seander/bithacks.html This is still a lazy answer so I downvoted. – Ray Jul 04 '16 at 17:31
Why there is 5 times `v |= v >>` repeated? I tried it with 2 time and it also works, eg:`v--; console.log(v); v |= v >> 1; console.log(v); v |= v >> 2; console.log(v); v++;` Using latest Chrome on Debian 9 64bit – Marecky Jan 02 '18 at 00:40
Any chance we could get a macro version of this? – benathon Sep 12 '18 at 04:54
19

This thread is still well referenced but this answer (and most others) are highly outdated. CPUs have an instruction to help this (actually already at that time?). From : https://jameshfisher.com/2018/03/30/round-up-power-2.html `uint64_t next_pow2(uint64_t x) { return x == 1 ? 1 : 1<<(64-__builtin_clzl(x-1)); }` And for 32 bit : `uint32_t next_pow2(uint32_t x) { return x == 1 ? 1 : 1<<(32-__builtin_clz(x-1)); }` That is if you use GCC (and Clang I think?), but it would be wise to take the time to find the call to CLZ instead of copy-pasting all options around. – MappaM Mar 13 '19 at 11:13
7

@MappaM This answer is still highly relevant and the best portable way to do it. Your 64-bit version has undefined behaviour if `x > UINT32_MAX` and isn't branchless. Also, GCC and Clang use `-mtune=generic` by default (as do most distros), so your code will NOT expand to the `lzcnt` instruction on x86_64 -- it'll actually expand to something MUCH slower (a libgcc routine) unless you use something like `-march=native`. So your proposed replacement is non-portable, buggy and (typically) slower. – Craig Barnes Jul 07 '19 at 14:34
1

I'm not saying this is a full and definitive answer, or I would not have answered in comments. But a consideration for speed would be nice. On ubuntu, it works fine with default. Note the _clzl and _clz so it works for >UINT32_MAX too. – MappaM Jul 09 '19 at 09:11
4

@MappaM that sounds like a case of "works on my machine". For me, `next_pow2(1ULL << 34)` returns 0 and the reason has nothing to do with `_clzl` vs. `_clz`. While your function is trivial to fix, the point I'm trying to make is simple -- if you're going to call other people's code "highly outdated", you should first make sure your own code is tested and working and not just a buggy copypasta. – Craig Barnes Dec 01 '19 at 09:13
just a slightest note. No need to used post increment, decrement. – E_g Apr 01 '20 at 09:05
@florin I feel like it would be more accurate to quote Bit Twiddling Hacks here and say that what needs to be found is the "result that may be expressed by the formula 1U << (lg(v - 1) + 1)", which is pow(2, (lg(v - 1) + 1). Because the snippet in the answer doesn't really compute base 2 logarithm, [this does](http://graphics.stanford.edu/~seander/bithacks.html#IntegerLogObvious). – haykoandri Apr 13 '20 at 19:07
Waity-minty, what if `v` is unity? I trow this solution needs a guard: `if( v == 1 ) return 2;` – Anton Shepelev Jul 21 '20 at 23:13
@Ant_222 but 1 is a power of 2, it's just the zeroth power, i.e. `pow(2, 0) == 1`. special casing this isn't correct in the general case, even if it might be useful for your use case – Sam Mason Aug 20 '20 at 07:34
1

@CraigBarnes x86 had `bsr` to calculate floor log2 efficiently since **386**. GCC's `__builtin_clz` expands to `bsr` with `-mtune=generic`. You *should* use `__builtin_clz` to do this portably and efficiently. – xiver77 Jun 28 '22 at 15:31
@xiver77 Fair point regarding `bsr`, but `__builtin_clz` is not portable. Of course it should be used when available, but that's tangential to the point I was making. – Craig Barnes Jul 01 '22 at 11:55
Also has the advantage of working in higher-level languages, which some other answers do not. – ReflexiveCode Dec 20 '22 at 20:14
1

@MappaM C++20 actually added std::countl_zero, which would be the portable version of __builtin_clz – pnda Jan 17 '23 at 17:02

score 95 · Answer 2 · edited Dec 11 '13 at 01:34

95

next = pow(2, ceil(log(x)/log(2)));

This works by finding the number you'd have raise 2 by to get x (take the log of the number, and divide by the log of the desired base, see wikipedia for more). Then round that up with ceil to get the nearest whole number power.

This is a more general purpose (i.e. slower!) method than the bitwise methods linked elsewhere, but good to know the maths, eh?

edited Dec 11 '13 at 01:34

s4y

50,525
12
70
98

answered Jan 21 '09 at 17:35

Paul Dixon

295,876
54
310
348

4

From C99, you can also just use [log2](http://www.cprogramming.com/fod/log2.html) if supported by your tools. GCC and VS don't seem to :( – Matthew Read Jan 22 '12 at 05:49
2

You're missing a bracket... next=pow(2, ceil(log(x)/log(2))); – Matthieu Cormier Aug 17 '12 at 17:10
22

Be careful about float accuracy, though. `log(pow(2,29))/log(2)` = 29.000000000000004, so the result is 2**30 instead of returning 2**29. I think this is why log2 functions exist? – endolith Dec 09 '13 at 02:52
2

Faster is `powf(2, ceilf(log2f(val)))` – tjklemz Jan 06 '14 at 20:46
80

The cost of this is probably at least 200 cycles and it's not even correct. Why does this have so many upvotes? – Axel Gneiting Aug 18 '15 at 20:17
4

@SuperflyJon But it mentions bitwise operators and I assume correctness is implied by any question unless noted otherwise. – BlackJack May 02 '17 at 16:02
3

Please do update the answer & use log2(). I made a mistake by not reading the comments and ended up with a great loss in a CP contest. :( – Abhishek Apr 12 '20 at 16:15
3

Can combine logarithms with bit operators: 1 << ceil(log2(x)) – Matt Eding Apr 12 '20 at 23:34
The main problem with this is say you're using a 64-bit integer value for x and ceil/log2 use a 64-bit double then there is no way to calculate for all values of x. – CR. Aug 31 '21 at 18:16

score 75 · Answer 3 · edited Sep 20 '12 at 08:50

75

I think this works, too:

int power = 1;
while(power < x)
    power*=2;

And the answer is power.

edited Sep 20 '12 at 08:50

ЯegDwight

24,821
10
45
52

answered Sep 20 '12 at 04:46

Jose Dagoberto

799
6
2

32

Fair enough the question asked for no loops. But as clever as some of the other functions are, for code that is not performance sensitive the answer that is quickly and easily understood and verified to be correct always wins for me. – Tim MB Jan 17 '13 at 12:44
2

This is not returning the nearest power of 2, butthe power of that is immediatly bigger than X. Still very good – CoffeDeveloper Jun 09 '13 at 16:34
4

Instead of multiplying, some bitwise "magic" can be used instead `power <<= 1` – vallentin Jul 30 '16 at 08:01
6

@Vallentin That should be automatically optimized by a compiler. – MarkWeston Oct 05 '17 at 05:59
8

Beware of infinite loop if `x` is too large (i.e. not enough bits to represent the next power of 2). – alban Mar 22 '19 at 14:01
1

@MarkWeston https://godbolt.org/z/ajvGsTs5b Clang trunk ARM 64 bit does not appear to optimize the loop at the moment. – nmr Dec 15 '21 at 23:15

Eclipse · Answer 4 · 2011-10-06T22:16:15.297

54

unsigned long upper_power_of_two(unsigned long v)
{
    v--;
    v |= v >> 1;
    v |= v >> 2;
    v |= v >> 4;
    v |= v >> 8;
    v |= v >> 16;
    v++;
    return v;

}

edited Oct 06 '11 at 22:16

answered Jan 21 '09 at 17:39

Eclipse

44,851
20
112
171

75

Would be nice if you have attributed it (unless you discovered it). It comes from the bit twiddling hacks page. – florin Jan 21 '09 at 17:47
3

Is that for a 32-bit number? Extension for 64-bit? – Jonathan Leffler Jan 21 '09 at 17:52
Jonathan, you need to do it for the upper half, and if that is zero, you do it for the lower half. – florin Jan 21 '09 at 18:03
6

@florin, if v is a 64-bit type, couldn't you just add a "c |= v >> 32" after the one for 16? – Evan Teran Jan 21 '09 at 18:18
Yup - it's from the bit-twiddling hacks as florin noted – Eclipse Jan 21 '09 at 19:29
2

Becareful when using signed int. Values > 0x4000_0000_0000_0000 will return 0x8000_0000_0000_0000. – Nathan Oct 18 '12 at 23:25
Input values > 0x8000_0000_0000_0000 will return 0. Input of 0 returns 0. – Nathan Oct 18 '12 at 23:26
@Nathan `printf("%lx\n", upper_power_of_two(0x8000000000000000));` --> `8000000000000000` when `upper_power_of_two()` uses 64-bit `unsigned long`. – chux - Reinstate Monica Oct 26 '15 at 17:29
I found some application in JDK ArrayDeque, but it doesn't have the first statement `v--` I tested a few cases, seems okay, not sure what's the difference – zinking Aug 30 '16 at 08:21
@zinking I know this is an old comment, but I was curious, and after trying it it seems that without the decrement if the input is power of two, then the output is double the input, whereas with the decrement, a power of two is returned unchanged. – Ryan1729 Apr 18 '18 at 03:43
4

Code that only works for a specific bit width should be using fixed-width types instead of minimum-width types. This function should take and return a `uint32_t`. – Craig Barnes Sep 21 '18 at 21:38
If you are using DPDK, [rte_common.h](https://github.com/DPDK/dpdk/blob/main/lib/eal/include/rte_common.h) has helper functions which use this mechanism such as `rte_align32pow2` which aligns to the next power of 2. – Chester Gillon Dec 20 '22 at 12:05

score 47 · Answer 5 · edited May 23 '17 at 12:26

47

If you're using GCC, you might want to have a look at Optimizing the next_pow2() function by Lockless Inc.. This page describes a way to use built-in function builtin_clz() (count leading zero) and later use directly x86 (ia32) assembler instruction bsr (bit scan reverse), just like it's described in another answer's link to gamedev site. This code might be faster than those described in previous answer.

By the way, if you're not going to use assembler instruction and 64bit data type, you can use this

/**
 * return the smallest power of two value
 * greater than x
 *
 * Input range:  [2..2147483648]
 * Output range: [2..2147483648]
 *
 */
__attribute__ ((const))
static inline uint32_t p2(uint32_t x)
{
#if 0
    assert(x > 1);
    assert(x <= ((UINT32_MAX/2) + 1));
#endif

    return 1 << (32 - __builtin_clz (x - 1));
}

edited May 23 '17 at 12:26

Community

1
1

answered Apr 13 '12 at 14:59

Yann Droneaud

5,277
1
23
39

3

Note that this returns the smallest power of 2 greater than OR equal to x. Changing (x -1) to x changes the function to return the smaller power of 2 greater than x. – Guillaume Jul 28 '13 at 18:50
3

You can use `_BitScanForward` on Visual C++ – KindDragon Dec 05 '13 at 10:25
You can also use `__builtin_ctz()` – MarkP Mar 31 '16 at 18:21
@MarkP `__builtin_ctz()` won't be useful to round any non power of 2 number up to the next power of two – Yann Droneaud Apr 01 '16 at 12:42
3

Please add in your answer a link to Wikipedia list of builtin bitwise functions for other compilers: https://en.wikipedia.org/wiki/Find_first_set#Tool_and_library_support Please provide also a 64-bits version. I propose the following C++11 function: `constexpr uint64_t nextPowerOfTwo64 (uint64_t x) { return 1ULL<<(sizeof(uint64_t) * 8 - __builtin_clzll(x)); }` – oHo Apr 09 '17 at 02:50
8 bits per byte is usually fine but `CHAR_BIT` is more accurate. – Gareth A. Lloyd Jun 12 '18 at 12:35
This looks quite strange and does't look portable, no offense. – Jul 18 '20 at 02:35
@oHo: If you're using C++, C++20 provides [`std::countl_zero`](https://en.cppreference.com/w/cpp/numeric/countl_zero) = CLZ and `std::bit_width` (index of MSB + 1 = x86 BSR+1) to finally provide a portable way for the language to expose something that most CPUs have been providing for years. In fact, C++20 even has [`std::bit_ceil`](https://en.cppreference.com/w/cpp/numeric/bit_ceil) which is the answer to this whole question. But this is a C question, and ISO C is still stuck in the dark ages in this regard, necessitating GNU extensions or other intrinsics to do it efficiently. – Peter Cordes Mar 03 '22 at 03:07
@oHo: Also if you're using C++, use `std::numeric_limits` to get the number of bits in a type. `uint64_t` is specifically guaranteed to have 64, but it's an optional type. CHAR_BIT shouldn't be assumed to be 8 if you're going for max portability, e.g. to DSPs, but `std::numeric_limits::digits` makes it easy. – Peter Cordes Mar 03 '22 at 03:09

score 25 · Answer 6 · answered Apr 06 '21 at 19:50

25

In standard c++20 this is included in <bit>. The answer is simply

#include <bit>
unsigned long upper_power_of_two(unsigned long v)
{
    return std::bit_ceil(v);
}

NOTE: The solution I gave is for c++, not c, I would give an answer this question instead, but it was closed as a duplicate of this one!

answered Apr 06 '21 at 19:50

Jake Schmidt

1,558
9
16

2

I agree that _"solve this in C"_ is **not** a duplicate of _"solve this in C++"_. You are demonstrating that. So I have [reopened the duplicate you link to](https://stackoverflow.com/questions/364985/algorithm-for-finding-the-smallest-power-of-two-thats-greater-or-equal-to-a-giv). Please feel free to move your C++ answer there. – Drew Dormann Feb 24 '23 at 19:54
I've added a link to the C++ question to this C question (and even a mention of `std::bit_ceil`), so this no longer needs to be an answer here. Please delete. Your answer here incorrectly states that the linked question is closed; as Drew says it's been reopened. – Peter Cordes Feb 28 '23 at 04:52

vlk · Answer 7 · 2018-08-21T06:05:12.310

23

One more, although I use cycle, but thi is much faster than math operands

power of two "floor" option:

int power = 1;
while (x >>= 1) power <<= 1;

power of two "ceil" option:

int power = 2;
x--;    // <<-- UPDATED
while (x >>= 1) power <<= 1;

UPDATE

As mentioned in comments there was mistake in ceil where its result was wrong.

Here are full functions:

unsigned power_floor(unsigned x) {
    int power = 1;
    while (x >>= 1) power <<= 1;
    return power;
}

unsigned power_ceil(unsigned x) {
    if (x <= 1) return 1;
    int power = 2;
    x--;
    while (x >>= 1) power <<= 1;
    return power;
}

edited Aug 21 '18 at 06:05

answered Jul 19 '14 at 20:46

vlk

2,581
3
31
35

4

the result is not correct if `x` is power of 2. A micro to test if input is power of 2 is needed. `#define ISPOW2(x) ((x) > 0 && !((x) & (x-1)))` – pgplus1628 Oct 28 '15 at 13:05
@zorksylar more efficiently would be to `if (x == 0) return 1; /* Or 0 (Which is what I use) */ x--; /* Rest of program */` – yyny Mar 03 '16 at 16:45
Good solution! but the `power of two "ceil" option` is not correct. For example, when `x = 2` the result should be `2` instead of `4` – MZD Aug 18 '18 at 12:02

score 14 · Answer 8 · edited Jun 01 '17 at 21:48

14

For any unsigned type, building on the Bit Twiddling Hacks:

#include <climits>
#include <type_traits>

template <typename UnsignedType>
UnsignedType round_up_to_power_of_2(UnsignedType v) {
  static_assert(std::is_unsigned<UnsignedType>::value, "Only works for unsigned types");
  v--;
  for (size_t i = 1; i < sizeof(v) * CHAR_BIT; i *= 2) //Prefer size_t "Warning comparison between signed and unsigned integer"
  {
    v |= v >> i;
  }
  return ++v;
}

There isn't really a loop there as the compiler knows at compile time the number of iterations.

edited Jun 01 '17 at 21:48

user877329

6,717
8
46
88

answered Jan 13 '15 at 05:31

robson

413
4
8

4

Note that the question is about C. – martinkunev Apr 07 '17 at 12:47
1

@martinkunev Just replace UnsignedType and process it manually. I am pretty sure a C programmer can expand this simple template ignoring the `std::is_unsigned::value` assertion. – user877329 Jun 01 '17 at 17:53
3

@user877329 Sure, it would be nice to have an answer in Javascript as well, just in case somebody wants to translate that to C. – martinkunev Jun 01 '17 at 18:10
@martinkunev UnsignedType in JavaScript? Anyway, this solution shows how to do it for any UnsignedType, and it happen to be written in C++, rather than pseudocode [ sizeof(v)*CHAR_BIT instead of something like number of bits in an object of UnsignedType ]. – user877329 Jun 01 '17 at 18:46

score 13 · Answer 9 · answered Jan 24 '19 at 07:24

13

Despite the question is tagged as c here my five cents. Lucky us, C++ 20 would include std::ceil2 and std::floor2 (see here). It is consexpr template functions, current GCC implementation uses bitshifting and works with any integral unsigned type.

answered Jan 24 '19 at 07:24

kreuzerkrieg

3,009
3
28
59

8

They recently renamed it to `bit_ceil` http://open-std.org/JTC1/SC22/WG21/docs/papers/2020/p1956r1.pdf – Wolfgang Brehm Mar 28 '20 at 11:30
5

`bit_floor` and `bit_ceil` are now available in C++20 from the `` header. https://en.cppreference.com/w/cpp/header/bit – SU3 Dec 01 '20 at 07:07

Jasper Bekkers · Answer 10 · 2009-01-21T18:36:59.677

10

For IEEE floats you'd be able to do something like this.

int next_power_of_two(float a_F){
    int f = *(int*)&a_F;
    int b = f << 9 != 0; // If we're a power of two this is 0, otherwise this is 1

    f >>= 23; // remove factional part of floating point number
    f -= 127; // subtract 127 (the bias) from the exponent

    // adds one to the exponent if were not a power of two, 
    // then raises our new exponent to the power of two again.
    return (1 << (f + b)); 
}

If you need an integer solution and you're able to use inline assembly, BSR will give you the log2 of an integer on the x86. It counts how many right bits are set, which is exactly equal to the log2 of that number. Other processors have similar instructions (often), such as CLZ and depending on your compiler there might be an intrinsic available to do the work for you.

edited Jan 21 '09 at 18:36

answered Jan 21 '09 at 18:15

Jasper Bekkers

6,711
32
46

This is an interesting one eventhough not related to the question ( I want to roundoff only integers), will try out this one.. – Naveen Jan 21 '09 at 18:29
Came up with it after reading the wikipedia article on floats. Besides that, I've used it to calculate square-roots in integer precision. Also nice, but even more unrelated. – Jasper Bekkers Jan 21 '09 at 18:32
This breaks the strict aliasing rules. On some compilers it may not work or issue a warning. – martinkunev Dec 01 '17 at 16:54

Johan · Answer 11 · 2022-02-15T11:12:47.017

7

In x86 you can use the sse4 bit manipulation instructions to make it fast.

//assume input is in eax
mov    ecx,31      
popcnt edx,eax   //cycle 1
lzcnt  eax,eax   //cycle 2
sub    ecx,eax
mov    eax,1
cmp    edx,1     //cycle 3
jle @done        //cycle 4 - popcnt says its a power of 2, return input unchanged
shl    eax,cl    //cycle 5
@done: rep ret   //cycle 5

In c you can use the matching intrinsics.

Or jumpless, which speeds up things by avoiding a misprediction due to a jump, but slows things down by lengthening the dependency chain. Time the code to see which works best for you.

//assume input is in eax
mov    ecx,31
popcnt edx,eax    //cycle 1
lzcnt  eax,eax
sub    ecx,eax
mov    eax,1      //cycle 2
cmp    edx,1
mov    edx,0     //cycle 3 
cmovle ecx,edx   //cycle 4 - ensure eax does not change
shl    eax,cl    
@done: rep ret   //cycle 5

edited Feb 15 '22 at 11:12

answered Mar 31 '16 at 15:49

Johan

74,508
24
191
319

Perhaps I'm missing something, but it doesn't look like this will work. It essentially left-shifts the value 2 by the count of leading zeros in the original number i.e. `2< – DBear Dec 13 '21 at 21:39
You should try and avoid the branch: replace it with `mov edx,0; cmovle ecx,edx` – chqrlie Feb 15 '22 at 08:56
Most CPUs with BMI1 also have BMI2, so you can use `shlx` for single-uop variable-count shifts on Intel if you don't mind excluding some old AMD CPUs. Also, if `popcnt(x) <= 1` is just to detect already being a power of 2, use the ZF output of [BMI1 `blsr dummy, eax`](https://github.com/HJLebbink/asm-dude/wiki/blsr) - if clearing the lowest set bit made the result zero, the input was a power of 2, or zero. It's single-uop on Intel, 2 uops on AMD before Zen 4. (https://uops.info/) – Peter Cordes Feb 28 '23 at 05:48

score 7 · Answer 12 · answered Oct 29 '18 at 07:32

7

Here's my solution in C. Hope this helps!

int next_power_of_two(int n) {
    int i = 0;
    for (--n; n > 0; n >>= 1) {
        i++;
    }
    return 1 << i;
}

answered Oct 29 '18 at 07:32

Kevin Yang

71
1
1

Philip J. Fry · Answer 13 · 2013-12-28T23:11:35.160

/*
** http://graphics.stanford.edu/~seander/bithacks.html#IntegerLog
*/
#define __LOG2A(s) ((s &0xffffffff00000000) ? (32 +__LOG2B(s >>32)): (__LOG2B(s)))
#define __LOG2B(s) ((s &0xffff0000)         ? (16 +__LOG2C(s >>16)): (__LOG2C(s)))
#define __LOG2C(s) ((s &0xff00)             ? (8  +__LOG2D(s >>8)) : (__LOG2D(s)))
#define __LOG2D(s) ((s &0xf0)               ? (4  +__LOG2E(s >>4)) : (__LOG2E(s)))
#define __LOG2E(s) ((s &0xc)                ? (2  +__LOG2F(s >>2)) : (__LOG2F(s)))
#define __LOG2F(s) ((s &0x2)                ? (1)                  : (0))

#define LOG2_UINT64 __LOG2A
#define LOG2_UINT32 __LOG2B
#define LOG2_UINT16 __LOG2C
#define LOG2_UINT8  __LOG2D

static inline uint64_t
next_power_of_2(uint64_t i)
{
#if defined(__GNUC__)
    return 1UL <<(1 +(63 -__builtin_clzl(i -1)));
#else
    i =i -1;
    i =LOG2_UINT64(i);
    return 1UL <<(1 +i);
#endif
}

If you do not want to venture into the realm of undefined behaviour the input value must be between 1 and 2^63. The macro is also useful to set constant at compile time.

This is the probably the worst solution (it's also missing ULL suffix on the 64-bit constant). This will generate 32 tests per input in all cases. Better to use a while loop, it'll always be faster or at the same speed. — xryl669, Apr 29 '16 at 14:33
BUT... this can be evaluated by the preprocessor if the input is a constant, and thus ZERO operation at run time! — Michael, Sep 13 '19 at 16:53

score 4 · Answer 14 · answered Dec 29 '13 at 00:24

4

For completeness here is a floating-point implementation in bog-standard C.

double next_power_of_two(double value) {
    int exp;
    if(frexp(value, &exp) == 0.5) {
        // Omit this case to round precise powers of two up to the *next* power
        return value;
    }
    return ldexp(1.0, exp);
}

answered Dec 29 '13 at 00:24

doynax

4,285
3
23
19

1

Random browsers, if you read this comment, choose this code.This is clearly the best answer, no special instructions, no bit twiddling, just efficient, portable and standard code. Guessing why no one else upvoted it ^^ – CoffeDeveloper Oct 25 '15 at 21:36
9

Random browsers, this is going to be dead slow if you don't have specialized floating point hardware. On x86 you can run circles around this code using bit twiddling. `rep bsr ecx,eax; mov eax,0; cmovnz eax,2; shl eax,cl` is about 25x faster. – Johan Mar 31 '16 at 15:31

NoelC · Answer 15 · 2018-09-01T02:24:30.523

An efficient Microsoft (e.g., Visual Studio 2017) specific solution in C / C++ for integer input. Handles the case of the input exactly matching a power of two value by decrementing before checking the location of the most significant 1 bit.

inline unsigned int ExpandToPowerOf2(unsigned int Value)
{
    unsigned long Index;
    _BitScanReverse(&Index, Value - 1);
    return (1U << (Index + 1));
}

// - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -

#if defined(WIN64) // The _BitScanReverse64 intrinsic is only available for 64 bit builds because it depends on x64

inline unsigned long long ExpandToPowerOf2(unsigned long long Value)
{
    unsigned long Index;
    _BitScanReverse64(&Index, Value - 1);
    return (1ULL << (Index + 1));
}

#endif

This generates 5 or so inlined instructions for an Intel processor similar to the following:

dec eax
bsr rcx, rax
inc ecx
mov eax, 1
shl rax, cl

Apparently the Visual Studio C++ compiler isn't coded to optimize this for compile-time values, but it's not like there are a whole lot of instructions there.

Edit:

If you want an input value of 1 to yield 1 (2 to the zeroth power), a small modification to the above code still generates straight through instructions with no branch.

inline unsigned int ExpandToPowerOf2(unsigned int Value)
{
    unsigned long Index;
    _BitScanReverse(&Index, --Value);
    if (Value == 0)
        Index = (unsigned long) -1;
    return (1U << (Index + 1));
}

Generates just a few more instructions. The trick is that Index can be replaced by a test followed by a cmove instruction.

A small mistake: It should return 1 for 1, it does not though. — 0kcats, Aug 30 '18 at 18:00
Thanks. In the application for which it was developed we explicitly needed 2 to the first power when 1 is input. 1 could be taken as a special case with a conditional without generating too many more instructions I imagine. — NoelC, Aug 31 '18 at 23:30
Updated the answer to include a version that returns 1 for an input value of 1. — NoelC, Sep 01 '18 at 02:53

Explorer09 · Answer 16 · 2021-07-05T04:48:38.237

Trying to make an "ultimate" solution for this. The following code

is targeted for C language (not C++),
uses compiler built-ins to yield efficient code (CLZ or BSR instruction) if compiler supports any,
is portable (standard C and no assembly) with the exception of built-ins, and
addresses all undefined behaviors.

If you're writing in C++, you may adjust the code appropriately. Note that C++20 introduces std::bit_ceil which does the exact same thing except the behavior may be undefined on certain conditions.

#include <limits.h>

#ifdef _MSC_VER
# if _MSC_VER >= 1400
/* _BitScanReverse is introduced in Visual C++ 2005 and requires
   <intrin.h> (also introduced in Visual C++ 2005). */
#include <intrin.h>
#pragma intrinsic(_BitScanReverse)
#pragma intrinsic(_BitScanReverse64)
#  define HAVE_BITSCANREVERSE 1
# endif
#endif

/* Macro indicating that the compiler supports __builtin_clz().
   The name HAVE_BUILTIN_CLZ seems to be the most common, but in some
   projects HAVE__BUILTIN_CLZ is used instead. */
#ifdef __has_builtin
# if __has_builtin(__builtin_clz)
#  define HAVE_BUILTIN_CLZ 1
# endif
#elif defined(__GNUC__)
# if (__GNUC__ > 3)
#  define HAVE_BUILTIN_CLZ 1
# elif defined(__GNUC_MINOR__)
#  if (__GNUC__ == 3 && __GNUC_MINOR__ >= 4)
#   define HAVE_BUILTIN_CLZ 1
#  endif
# endif
#endif

/**
 * Returns the smallest power of two that is not smaller than x.
 */
unsigned long int next_power_of_2_long(unsigned long int x)
{
    if (x <= 1) {
        return 1;
    }
    x--;

#ifdef HAVE_BITSCANREVERSE
    if (x > (ULONG_MAX >> 1)) {
        return 0;
    } else {
        unsigned long int index;
        (void) _BitScanReverse(&index, x);
        return (1UL << (index + 1));
    }
#elif defined(HAVE_BUILTIN_CLZ)
    if (x > (ULONG_MAX >> 1)) {
        return 0;
    }
    return (1UL << (sizeof(x) * CHAR_BIT - __builtin_clzl(x)));
#else
    /* Solution from "Bit Twiddling Hacks"
       <http://www.graphics.stanford.edu/~seander/bithacks.html#RoundUpPowerOf2>
       but converted to a loop for smaller code size.
       ("gcc -O3" will unroll this.) */
    {
        unsigned int shift;
        for (shift = 1; shift < sizeof(x) * CHAR_BIT; shift <<= 1) {
            x |= (x >> shift);
        }
    }
    return (x + 1);
#endif
}

unsigned int next_power_of_2(unsigned int x)
{
    if (x <= 1) {
        return 1;
    }
    x--;

#ifdef HAVE_BITSCANREVERSE
    if (x > (UINT_MAX >> 1)) {
        return 0;
    } else {
        unsigned long int index;
        (void) _BitScanReverse(&index, x);
        return (1U << (index + 1));
    }
#elif defined(HAVE_BUILTIN_CLZ)
    if (x > (UINT_MAX >> 1)) {
        return 0;
    }
    return (1U << (sizeof(x) * CHAR_BIT - __builtin_clz(x)));
#else
    {
        unsigned int shift;
        for (shift = 1; shift < sizeof(x) * CHAR_BIT; shift <<= 1) {
            x |= (x >> shift);
        }
    }
    return (x + 1);
#endif
}

unsigned long long next_power_of_2_long_long(unsigned long long x)
{
    if (x <= 1) {
        return 1;
    }
    x--;

#if (defined(HAVE_BITSCANREVERSE) && \
    ULLONG_MAX == 18446744073709551615ULL)
    if (x > (ULLONG_MAX >> 1)) {
        return 0;
    } else {
        /* assert(sizeof(__int64) == sizeof(long long)); */
        unsigned long int index;
        (void) _BitScanReverse64(&index, x);
        return (1ULL << (index + 1));
    }
#elif defined(HAVE_BUILTIN_CLZ)
    if (x > (ULLONG_MAX >> 1)) {
        return 0;
    }
    return (1ULL << (sizeof(x) * CHAR_BIT - __builtin_clzll(x)));
#else
    {
        unsigned int shift;
        for (shift = 1; shift < sizeof(x) * CHAR_BIT; shift <<= 1) {
            x |= (x >> shift);
        }
    }
    return (x + 1);
#endif
}

Ryan · Answer 17 · 2021-08-24T05:44:03.553

Portable solution in C#:

int GetNextPowerOfTwo(int input) {
    return 1 << (int)Math.Ceiling(Math.Log2(input));
}

Math.Ceiling(Math.Log2(value)) calculates the exponent of the next power of two, the 1 << calculates the real value through bitshifting.

Faster solution if you have .NET Core 3 or above:

uint GetNextPowerOfTwoFaster(uint input) {
    return (uint)1 << (sizeof(uint) * 8 - System.Numerics.BitOperations.LeadingZeroCount(input - 1));
}

This uses System.Numerics.BitOperations.LeadingZeroCount() which uses a hardware instruction if available:

https://github.com/dotnet/corert/blob/master/src/System.Private.CoreLib/shared/System/Numerics/BitOperations.cs

Update:

RoundUpToPowerOf2() is Coming in .NET 6! The internal implementation is mostly the same as the .NET Core 3 solution above.

Here's the community update.

score 1 · Answer 18 · answered Oct 16 '19 at 11:00

1

You might find the following clarification to be helpful towards your purpose:

answered Oct 16 '19 at 11:00

eigenaus

79
1
4

Timur Usmanov · Answer 19 · 2022-02-15T08:53:43.717

constexpr version of clp2 for C++14

#include <iostream>
#include <type_traits>

// Closest least power of 2 minus 1. Returns 0 if n = 0.
template <typename UInt, std::enable_if_t<std::is_unsigned<UInt>::value,int> = 0>
  constexpr UInt clp2m1(UInt n, unsigned i = 1) noexcept
    { return i < sizeof(UInt) * 8 ? clp2m1(UInt(n | (n >> i)),i << 1) : n; }

/// Closest least power of 2 minus 1. Returns 0 if n <= 0.
template <typename Int, std::enable_if_t<std::is_integral<Int>::value && std::is_signed<Int>::value,int> = 0>
  constexpr auto clp2m1(Int n) noexcept
    { return clp2m1(std::make_unsigned_t<Int>(n <= 0 ? 0 : n)); }

/// Closest least power of 2. Returns 2^N: 2^(N-1) < n <= 2^N. Returns 0 if n <= 0.
template <typename Int, std::enable_if_t<std::is_integral<Int>::value,int> = 0>
  constexpr auto clp2(Int n) noexcept
    { return clp2m1(std::make_unsigned_t<Int>(n-1)) + 1; }

/// Next power of 2. Returns 2^N: 2^(N-1) <= n < 2^N. Returns 1 if n = 0. Returns 0 if n < 0.
template <typename Int, std::enable_if_t<std::is_integral<Int>::value,int> = 0>
  constexpr auto np2(Int n) noexcept
    { return clp2m1(std::make_unsigned_t<Int>(n)) + 1; }

template <typename T>
  void test(T v) { std::cout << clp2(v) << std::endl; }

int main()
{
    test(-5);                          // 0
    test(0);                           // 0
    test(8);                           // 8
    test(31);                          // 32
    test(33);                          // 64
    test(789);                         // 1024
    test(char(260));                   // 4
    test(unsigned(-1) - 1);            // 0
    test<long long>(unsigned(-1) - 1); // 4294967296

    return 0;
}

score 0 · Answer 20 · answered Oct 04 '13 at 21:52

0

Many processor architectures support log base 2 or very similar operation – count leading zeros. Many compilers have intrinsics for it. See https://en.wikipedia.org/wiki/Find_first_set

answered Oct 04 '13 at 21:52

Simon

3,224
3
23
17

it's not about finding the highest set bit (=bsr) or counting leading zeros. he wants to *round up* to nearest power of 2. the answer with "subtract 1, then do bsr and shift 1 left" does that. – Flo Oct 13 '18 at 17:23

score 0 · Answer 21 · answered Aug 19 '16 at 02:10

Assuming you have a good compiler & it can do the bit twiddling before hand thats above me at this point, but anyway this works!!!

    // http://graphics.stanford.edu/~seander/bithacks.html#IntegerLogObvious
    #define SH1(v)  ((v-1) | ((v-1) >> 1))            // accidently came up w/ this...
    #define SH2(v)  ((v) | ((v) >> 2))
    #define SH4(v)  ((v) | ((v) >> 4))
    #define SH8(v)  ((v) | ((v) >> 8))
    #define SH16(v) ((v) | ((v) >> 16))
    #define OP(v) (SH16(SH8(SH4(SH2(SH1(v))))))         

    #define CB0(v)   ((v) - (((v) >> 1) & 0x55555555))
    #define CB1(v)   (((v) & 0x33333333) + (((v) >> 2) & 0x33333333))
    #define CB2(v)   ((((v) + ((v) >> 4) & 0xF0F0F0F) * 0x1010101) >> 24)
    #define CBSET(v) (CB2(CB1(CB0((v)))))
    #define FLOG2(v) (CBSET(OP(v)))

Test code below:

#include <iostream>

using namespace std;

// http://graphics.stanford.edu/~seander/bithacks.html#IntegerLogObvious
#define SH1(v)  ((v-1) | ((v-1) >> 1))  // accidently guess this...
#define SH2(v)  ((v) | ((v) >> 2))
#define SH4(v)  ((v) | ((v) >> 4))
#define SH8(v)  ((v) | ((v) >> 8))
#define SH16(v) ((v) | ((v) >> 16))
#define OP(v) (SH16(SH8(SH4(SH2(SH1(v))))))         

#define CB0(v)   ((v) - (((v) >> 1) & 0x55555555))
#define CB1(v)   (((v) & 0x33333333) + (((v) >> 2) & 0x33333333))
#define CB2(v)   ((((v) + ((v) >> 4) & 0xF0F0F0F) * 0x1010101) >> 24)
#define CBSET(v) (CB2(CB1(CB0((v)))))
#define FLOG2(v) (CBSET(OP(v))) 

#define SZ4         FLOG2(4)
#define SZ6         FLOG2(6)
#define SZ7         FLOG2(7)
#define SZ8         FLOG2(8) 
#define SZ9         FLOG2(9)
#define SZ16        FLOG2(16)
#define SZ17        FLOG2(17)
#define SZ127       FLOG2(127)
#define SZ1023      FLOG2(1023)
#define SZ1024      FLOG2(1024)
#define SZ2_17      FLOG2((1ul << 17))  // 
#define SZ_LOG2     FLOG2(SZ)

#define DBG_PRINT(x) do { std::printf("Line:%-4d" "  %10s = %-10d\n", __LINE__, #x, x); } while(0);

uint32_t arrTble[FLOG2(63)];

int main(){
    int8_t n;

    DBG_PRINT(SZ4);    
    DBG_PRINT(SZ6);    
    DBG_PRINT(SZ7);    
    DBG_PRINT(SZ8);    
    DBG_PRINT(SZ9); 
    DBG_PRINT(SZ16);
    DBG_PRINT(SZ17);
    DBG_PRINT(SZ127);
    DBG_PRINT(SZ1023);
    DBG_PRINT(SZ1024);
    DBG_PRINT(SZ2_17);

    return(0);
}

Outputs:

Line:39           SZ4 = 2
Line:40           SZ6 = 3
Line:41           SZ7 = 3
Line:42           SZ8 = 3
Line:43           SZ9 = 4
Line:44          SZ16 = 4
Line:45          SZ17 = 5
Line:46         SZ127 = 7
Line:47        SZ1023 = 10
Line:48        SZ1024 = 10
Line:49        SZ2_16 = 17

score 0 · Answer 22 · edited May 10 '17 at 17:39

I'm trying to get nearest lower power of 2 and made this function. May it help you.Just multiplied nearest lower number times 2 to get nearest upper power of 2

int nearest_upper_power(int number){
    int temp=number;
    while((number&(number-1))!=0){
        temp<<=1;
        number&=temp;
    }
    //Here number is closest lower power 
    number*=2;
    return number;
}

score 0 · Answer 23 · answered Sep 25 '18 at 08:01

0

Adapted Paul Dixon's answer to Excel, this works perfectly.

 =POWER(2,CEILING.MATH(LOG(A1)/LOG(2)))

answered Sep 25 '18 at 08:01

Tim Tharratt

1,251
1
10
18

score 0 · Answer 24 · answered Oct 09 '18 at 16:39

A variant of @YannDroneaud answer valid for x==1, only for x86 plateforms, compilers, gcc or clang:

__attribute__ ((const))
static inline uint32_t p2(uint32_t x)
{
#if 0
    assert(x > 0);
    assert(x <= ((UINT32_MAX/2) + 1));
#endif
  int clz;
  uint32_t xm1 = x-1;
  asm(
    "lzcnt %1,%0"
    :"=r" (clz)
    :"rm" (xm1)
    :"cc"
    );
    return 1 << (32 - clz);
}

score 0 · Answer 25 · answered Feb 15 '19 at 20:19

Here is what I'm using to have this be a constant expression, if the input is a constant expression.

#define uptopow2_0(v) ((v) - 1)
#define uptopow2_1(v) (uptopow2_0(v) | uptopow2_0(v) >> 1)
#define uptopow2_2(v) (uptopow2_1(v) | uptopow2_1(v) >> 2)
#define uptopow2_3(v) (uptopow2_2(v) | uptopow2_2(v) >> 4)
#define uptopow2_4(v) (uptopow2_3(v) | uptopow2_3(v) >> 8)
#define uptopow2_5(v) (uptopow2_4(v) | uptopow2_4(v) >> 16)

#define uptopow2(v) (uptopow2_5(v) + 1)  /* this is the one programmer uses */

So for instance, an expression like:

uptopow2(sizeof (struct foo))

will nicely reduce to a constant.

score 0 · Answer 26 · answered Apr 26 '21 at 02:49

0

The g++ compiler provides a builtin function __builtin_clz that counts leading zeros:

So we could do:

int nextPowerOfTwo(unsigned int x) {
  return 1 << sizeof(x)*8 - __builtin_clz(x);
}

int main () {
  std::cout << nextPowerOfTwo(7)  << std::endl;
  std::cout << nextPowerOfTwo(31) << std::endl;
  std::cout << nextPowerOfTwo(33) << std::endl;
  std::cout << nextPowerOfTwo(8)  << std::endl;
  std::cout << nextPowerOfTwo(91) << std::endl;
  
  return 0;
}

Results:

But note that, for x == 0, __builtin_clz return is undefined.

answered Apr 26 '21 at 02:49

lnogueir

1,859
2
10
21

1

There's another undefined behavior in your code. When `x > (UINT_MAX / 2)`, `__builtin_clz` would return 0 causing the left shift (`<<`) operator to overflow. Also, you should use `1U`, not 1, as the left operand of `<<`, otherwise your defined domain will be smaller (`(INT_MAX / 2)` instead of `(UINT_MAX / 2)`). – Explorer09 Jul 17 '21 at 22:26
1

Returning "16" for the input "8" is not really the right answer. – Ian Rehwinkel May 11 '22 at 20:33

score -1 · Answer 27 · answered Jan 21 '09 at 17:40

-1

If you need it for OpenGL related stuff:

/* Compute the nearest power of 2 number that is 
 * less than or equal to the value passed in. 
 */
static GLuint 
nearestPower( GLuint value )
{
    int i = 1;

    if (value == 0) return -1;      /* Error! */
    for (;;) {
         if (value == 1) return i;
         else if (value == 3) return i*4;
         value >>= 1; i *= 2;
    }
}

answered Jan 21 '09 at 17:40

Paulo Lopes

5,845
22
31

1

florin: it is. and it is used as a loop here, isn't it? – Tamas Czinege Jan 21 '09 at 17:56
9

DrJokepu - I think florin meant to say here that the OP asked for a loop-less solution – Eli Bendersky Nov 08 '09 at 05:56

David Wallace · Answer 28 · 2019-10-24T08:08:34.697

-1

Convert it to a float and then use .hex() which shows the normalized IEEE representation.

>>> float(789).hex() '0x1.8a80000000000p+9'

Then just extract the exponent and add 1.

>>> int(float(789).hex().split('p+')[1]) + 1 10

And raise 2 to this power.

>>> 2 ** (int(float(789).hex().split('p+')[1]) + 1) 1024

edited Oct 24 '19 at 08:08

answered Oct 24 '19 at 08:02

David Wallace

212
4
12

1

Note this answer is in python – David Wallace Oct 29 '19 at 18:43

score -1 · Answer 29 · answered Oct 22 '21 at 10:00

-1

from math import ceil, log2
pot_ceil = lambda N: 0x1 << ceil(log2(N))

Test:

for i in range(10):
    print(i, pot_ceil(i))

Output:

answered Oct 22 '21 at 10:00

P i

29,020
36
159
267

score -2 · Answer 30 · answered Feb 25 '20 at 01:55

import sys


def is_power2(x):
    return x > 0 and ((x & (x - 1)) == 0)


def find_nearest_power2(x):
    if x <= 0:
        raise ValueError("invalid input")
    if is_power2(x):
        return x
    else:
        bits = get_bits(x)
        upper = 1 << (bits)
        lower = 1 << (bits - 1)
        mid = (upper + lower) // 2
        if (x - mid) > 0:
            return upper
        else:
            return lower


def get_bits(x):
    """return number of bits in binary representation"""
    if x < 0:
        raise ValueError("invalid input: input should be positive integer")
    count = 0
    while (x != 0):
        try:
            x = x >> 1
        except TypeError as error:
            print(error, "input should be of type integer")
            sys.exit(1)
        count += 1
    return count

score -2 · Answer 31 · answered May 01 '20 at 08:56

-2

If you want an one-line-template. Here it is

int nxt_po2(int n) { return 1 + (n|=(n|=(n|=(n|=(n|=(n-=1)>>1)>>2)>>4)>>8)>>16); }

or

int nxt_po2(int n) { return 1 + (n|=(n|=(n|=(n|=(n|=(n-=1)>>(1<<0))>>(1<<1))>>(1<<2))>>(1<<3))>>(1<<4)); }

answered May 01 '20 at 08:56

Vo Hoang Anh

21
2

1

This is undefined behaviour in C or C++ and will lead to errors. Modifying `n` multiple times without a sequence point is invalid. You wrote it as if `n-=1` should happen first but the only guarantee here is that `n` contains its new value after the `;` and the parentheses do not change that. – sam hocevar May 19 '20 at 12:18

Rounding up to next power of 2

31 Answers31

Linked

Related