How the magic bits are improving the strlen function in glibc

Question

I was going through the source of strlen for glibc. They have used magic bits to find the length of string. Can someone please explain how it is working. Thank you

Related: [Why does glibc's strlen need to be so complicated to run quickly?](https://stackoverflow.com/a/57676035) - it only uses that fallback C version on platforms where it doesn't have hand-written asm. For example on x86-64, it uses SSE2 or AVX2 `pcmpeb` to check 16 or 32 bytes at once. Also [How to determine if a byte is null in a word](https://stackoverflow.com/q/27322182) is another Q&A about the same glibc source. — Peter Cordes, Jun 08 '22 at 21:41

score 5 · Accepted Answer · answered Dec 26 '12 at 14:54

Let's say this function is looking through a string -- 4 bytes at a time, as explained by the comments (we assume long ints are 4 bytes) -- and the current "chunk" look like this:

'\3'     '\3'     '\0'     '\3'
00000011 00000011 00000000 00000011  (as a string: "\x03\x03\x00\x03")

The strlen function is just looking for the first zero byte in this string. It first determines, for each 4-byte chunk, whether there's any zero byte in there, by checking this magic_bits shortcut first: it adds the 4 bytes to this value:

01111110 11111110 11111110 11111111

Adding any non-zero bytes to this value will cause the 1's to overflow into the holes marked by zeroes, by propagating the carries. For our chunk, it'd look like this:

  11111111 111111          1 1111111   Carries
  00000011 00000011 00000000 00000011  Chunk
  01111110 11111110 11111110 11111111  Magic bits
+ -----------------------------------
  10000010 00000001 11111111 00000010
  ^      ^        ^        ^

(The hole bits are marked by ^'s.)

And, from the comments:

       /* Look at only the hole bits.  If any of the hole bits
          are unchanged, most likely one of the bytes was a
          zero.  */

If there's no zeroes in the chunk, all of the hole bits will get set to 1's. However, because of the zero byte, one hole bit didn't get filled by a propagating carry, and we can then go check which byte it was.

Essentially, it speeds up the strlen calculation by applying some bit addition magic to 4-byte chunks to scan for zeroes, before narrowing down the search to single byte comparisons.

`The one misfire occurs when bits 24-30 are clear and bit 31 is set;` can you explain what's the misfire? — yuan, Dec 27 '12 at 02:09
@ZhangYuan: the shortcut "misfires" when the magic bits test *thinks* there's a '\0' byte somwhere, but there isn't. Apparently this is related to one of the values being '\x80', or something. — Lynn, Dec 28 '12 at 21:11

ouah · Answer 2 · 2012-12-26T14:50:18.397

2

The idea is instead to compare one byte at a time against zero, rather to check one unsigned long object at a time if one of its byte is zero. This means checking 8 bytes at a time when sizeof (unsigned long) is 8.

With bit hacks, there is a fast known expression that can determine if one of the bytes compares equal to zero. Then if one of the bytes is equal to zero, the bytes of the object are individually tested to find the first one which is zero. The advantage of using bitwise operations is it reduces the number of branching instructions.

The bit hack expression to check if one of the byte of a multi-byte object is equal to zero is explained in the famous Stanford Bit Twiddling Hacks page, in

Determine if a word has a zero byte http://graphics.stanford.edu/~seander/bithacks.html#ZeroInWord

edited Dec 26 '12 at 14:50

answered Dec 26 '12 at 14:44

ouah

142,963
15
272
331

So [the glibc](http://tsunanet.net/~tsuna/strlen.c.html) implementation is faster than [the libc](https://www.student.cs.uwaterloo.ca/~cs350/common/os161-src-html/strlen_8c-source.html) implementation ? – Aakashdeep Dec 26 '12 at 16:01
@Aakashdeep - It also depends on how long your strings are. There is an investment in settings up the magic. How often does this pay off? – Bo Persson Dec 26 '12 at 21:28

How the magic bits are improving the strlen function in glibc

2 Answers2

Linked

Related