The man page of `bzero` states that, for various security reasons, it is deprecated and `memset` should be used instead. It mainly refers to the issue that `bzero` or `explicit_bzero` cannot find all copies of the given data (especially data small enough to fit into a register) and may not completely erase or overwrite it as expected. But `memset` only takes a pointer. How is `memset` supposed to find all copies and close this security gap?

- Please see [Why use bzero over memset?](https://stackoverflow.com/questions/17096990/why-use-bzero-over-memset) – Weather Vane Dec 15 '18 at 12:03
- @WeatherVane I saw this question already but the answers aren't referring to the security issue. And the man page basically states that it is deprecated because of security lacks. But I don't see how `memset` should provide it instead ^^' – Lavair Dec 15 '18 at 12:14
- Most optimizers treat memset like an intrinsic, replacing it with a few fast MOVs where applicable. I suspect the security issue came up because bzero() was used in a lot of legacy networking code. Defeated when compilers got smart about inlining code, thus requiring the explicit_bzero() workaround. – Hans Passant Dec 15 '18 at 12:33
- but explicit_bzero() has still similar security issues. When memset isn't a solution for this and bzero() became only deprecated because it was used in legacy networking code, what do they use now to erase data in networking code? – Lavair Dec 15 '18 at 12:35
1 Answer
I think you've misread the man page. Assuming that you're talking about the Linux man page, it claims (correctly) that `explicit_bzero`, `memset_explicit` and `memset_s` are more secure (for certain purposes) than `memset` and `bzero`. It doesn't claim any security difference between `memset` and `bzero`. The reason `bzero` is deprecated is that it's a trivial wrapper around `memset` and all¹ C implementations have `memset`, so programmers might as well use `memset`.
The difference between `memset`/`bzero` and the `explicit_`/`_s` variants is that compilers are forbidden from optimizing away the explicit variants. This makes the explicit variants suitable for scrubbing confidential data. For example, consider the following program snippet:
    bzero(password, password_length);
    free(password);
With just `bzero` or `memset`, many modern compilers see “oh, you're writing to memory and then freeing it. There's no way to read back what `bzero` just wrote, so the call to `bzero` is equivalent to doing nothing. Doing nothing is faster than calling `bzero`, therefore I'll generate no code for the `bzero` call.”
The flaw in the compiler's reasoning is that the reason for zeroing the memory is not the defined behavior of your program, but what happens in case of subsequent undefined or unspecified behavior. From the point of view of a C compiler, undefined behavior means anything can happen. From the point of view of a security engineer, exactly what happens on undefined behavior such as a buffer overflow or a use-after-free is very important. Likewise, exactly what is read back from uninitialized memory is unspecified, but important to a security engineer. Security engineers try to reduce the security impact of such undefined or unspecified behavior.
So for a security engineer, that optimization of `memset` is unfortunate. A security engineer wants to guarantee that when memory has been freed, its former contents will not leak out, even, say, due to a buffer overflow. Hence `explicit_bzero`: compilers are instructed to treat the content of the target memory when this function returns as observable, so they aren't allowed to optimize the call away on the basis that the program isn't reading back from it. Semantically, `explicit_bzero(buffer, length)` is equivalent to
    bzero(buffer, length);
    for (size_t i = 0; i < length; i++) __observe__(buffer[i]);
where `__observe__` has no effect but nonetheless depends on the value of its argument. The compiler is therefore not allowed to remove the call to `bzero`, because then `__observe__` would not read back the correct values.
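With GCC or Clang, something like `__observe__` can be approximated by an empty inline-assembly statement with a `memory` clobber. The following is only a sketch under that assumption: `scrub_with_barrier` is a hypothetical name, inline assembly is a compiler extension, and the C standard does not guarantee that a future optimizer will keep honoring it.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, not a library function.
 * Zero the buffer, then make the compiler assume its contents are read.
 * The asm statement emits no instructions (GCC/Clang extension). Because it
 * receives the buffer address and declares a "memory" clobber, the compiler
 * must assume it may read the buffer, so the memset before it cannot be
 * discarded as a dead store. */
static void scrub_with_barrier(void *buffer, size_t length)
{
    memset(buffer, 0, length);
    __asm__ __volatile__("" : : "r"(buffer) : "memory");
}
```

A call such as `scrub_with_barrier(password, password_length);` would then go right before `free(password);`. Where `explicit_bzero` or an equivalent is available, it is the better choice.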
Explicit zeroing has limitations. The man page highlights that it won't scrub copies of a variable in registers, but this is usually not a big concern because it's rare for a buffer overflow or a read from uninitialized memory to end up leaking register values. The biggest limitation in practice is `realloc`. When you use dynamically allocated memory, `realloc` may move it, and there's no way to scrub the old copy. For this reason, if the contents of a buffer are sensitive, you must not use `realloc` on it.
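One way to grow a sensitive buffer without `realloc` is to allocate a new block, copy the data, scrub the old block and only then free it. The sketch below is illustrative: `sensitive_realloc` and `scrub` are hypothetical names, the caller is assumed to track the old size, and the volatile-pointer loop in `scrub` is a common technique that current compilers keep in practice, not something the standard guarantees.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical scrubbing primitive for this sketch: stores go through a
 * volatile-qualified pointer so that compilers do not remove them as dead.
 * Prefer explicit_bzero, memset_s or memset_explicit where available. */
static void scrub(void *buffer, size_t length)
{
    volatile unsigned char *p = buffer;
    while (length--)
        *p++ = 0;
}

/* Hypothetical replacement for realloc on sensitive buffers.
 * old_size must be supplied by the caller, because standard C cannot query
 * the size of an allocation. new_size is assumed to be nonzero.
 * On failure, returns NULL and leaves the old buffer untouched. */
static void *sensitive_realloc(void *old, size_t old_size, size_t new_size)
{
    void *fresh = malloc(new_size);
    if (fresh == NULL)
        return NULL;
    if (old != NULL) {
        memcpy(fresh, old, old_size < new_size ? old_size : new_size);
        scrub(old, old_size);   /* erase the old copy before releasing it */
        free(old);
    }
    return fresh;
}
```

Unlike `realloc`, this always moves the data, which costs a copy, but the program stays in control of every block that has ever held the secret.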
Another limitation of explicit zeroing is that it only applies at the program level, not at the system level. Copies of the data may remain in caches, in swap, etc. The goal of zeroing memory inside your program is to protect from a security breach inside your program. It doesn't protect from larger system compromises.
Note that writing your own `explicit_bzero` is impossible to do portably. The best you can do is to make it work with a finite set of versions of a finite set of compilers, with no guarantee that the next version won't have a fancier optimizer that sees through your attempt. That's why C11 added `memset_s` as a standard function (in the optional Annex K), and C23 later added `memset_explicit`.
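Since `memset_s` belongs to the optional Annex K, code that wants to use it has to check for it. The sketch below shows one way to do that; `scrub_secret` is a hypothetical name, and the fallback loop is a best-effort assumption with exactly the portability caveat described above.

```c
#define __STDC_WANT_LIB_EXT1__ 1   /* request Annex K declarations, if any */
#include <stddef.h>
#include <string.h>

/* Hypothetical wrapper: use memset_s when the implementation provides
 * Annex K (signalled by __STDC_LIB_EXT1__), otherwise fall back to stores
 * through a volatile-qualified pointer, which compilers keep in practice
 * but which the standard does not guarantee. */
static void scrub_secret(void *buffer, size_t length)
{
#if defined(__STDC_LIB_EXT1__)
    memset_s(buffer, length, 0, length);
#else
    volatile unsigned char *p = buffer;
    while (length--)
        *p++ = 0;
#endif
}
```

On platforms that offer `explicit_bzero` (for example glibc and the BSDs), a real project would more likely add a branch for it than rely on the fallback loop.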
¹ Almost. Technically, freestanding implementations don't have to have `memset`, but it's such a simple and useful function that most do, and it's usually provided by the compiler so it's available even when building without the usual C runtime.
