Is it possible to guarantee code doing memory writes is not optimized away in C++?

Question

C++ compilers are allowed to optimize away writes into memory:

 {
     //all this code can be eliminated
     char buffer[size];
     std::fill_n( buffer, size, 0);
 }

When dealing with sensitive data, the typical approach is to use volatile* pointers to ensure that the compiler emits the memory writes. Here's how the SecureZeroMemory() function is implemented in the Windows headers used by Visual C++ (WinNT.h):

FORCEINLINE PVOID RtlSecureZeroMemory(
     __in_bcount(cnt) PVOID ptr, __in SIZE_T cnt )
{
    volatile char *vptr = (volatile char *)ptr;
#if defined(_M_AMD64)
    __stosb((PBYTE )((DWORD64)vptr), 0, cnt);
#else
    while (cnt) {
        *vptr = 0;
        vptr++;
        cnt--;
    }
#endif
    return ptr;
}

The function casts the passed pointer to a volatile* pointer and then writes through the latter. However, if I use it on a local variable:

char buffer[size];
SecureZeroMemory( buffer, size );

the variable itself is not volatile. So, according to the C++ Standard's definition of observable behavior, writes into buffer don't count as observable behavior, and it looks like they can be optimized away.

There are a lot of comments below about page files, caches, etc., which are all valid, but let's ignore them for this question. The only thing this question is about is whether the code for memory writes can be optimized away.

Is it possible to ensure that code doing writes into memory is not optimized away in C++? Is the solution in SecureZeroMemory() compliant with the C++ Standard?

@BigBoss: what if you don't want all future reads/writes to the variable to be volatile? It's an interesting question, and something I've wondered before — jalf, Nov 07 '12 at 11:32
It doesn't matter whether `buffer` is volatile or not. The writes are through an lvalue of a volatile type and that's all that matters. [intro.execution]/12 — avakar, Nov 07 '12 at 11:33
@BigBoss: Nope, I can't use `volatile` for every variable that can possibly contain sensitive data. — sharptooth, Nov 07 '12 at 11:36
Possibly a stupid question, but is it not practical to just traverse the buffer after zeroing it out and perhaps just verify that each byte is set to zero? I know precious little about how clever optimising compilers are in this regard, but wouldn't a forced read in this way prevent optimising away the write? — Rook, Nov 07 '12 at 11:37
@Rook: That's a good question, but again, those reads don't affect observable behavior any more than the writes. — sharptooth, Nov 07 '12 at 11:42
Also, bear in mind that the standard describes the effects of volatile access using the C++ abstract machine. How this machine maps to your (real) architecture is debatable. Whether a volatile write ends up storing a value into RAM, whether it just emits the value into a shared cache, or whether it writes into a stale swap page is something you'd have to ask your compiler vendor. — avakar, Nov 07 '12 at 11:43
The latter, by the way, will not happen on any architecture I know. You have to make an effort to ensure that the password doesn't end up lying in the page file. — avakar, Nov 07 '12 at 11:45
@avakar: Yep, the page file is a separate problem. However, having the data in the address space is bad too: the same program can unintentionally read that data and maybe send it over the Internet. – sharptooth, Nov 07 '12 at 11:47
And if volatile writes on your implementation do write through the cache to RAM (or even to the page file), the old value could also be in stale CPU caches that aren't coherent to the cache on the CPU that did the volatile write. Thus a different thread in the same program could still unintentionally send it over the internet, although obviously you've mitigated most of the risk if you can force the write to RAM. `SecureZeroMemory` doesn't have to worry about this, because Windows has cache coherency. — Steve Jessop, Nov 07 '12 at 11:48
@sharptooth, such a program would be in the realm of undefined behavior. You don't get to ask for standard compliance then. — avakar, Nov 07 '12 at 11:55
@avakar: Well, okay, but that shouldn't stop me from writing my code right. — sharptooth, Nov 07 '12 at 11:58
Perhaps you could issue a memory barrier to force the write. Not sure whether there is a portable C++ way to achieve this, but e.g. gcc provides `__sync_synchronize` as a builtin to ensure that the data just written becomes visible to other threads. – MvG, Nov 07 '12 at 12:19
A typical OS can page memory, and the underlying machine architecture often provides memory virtualization primitives to do so, so your sensitive data could be stored in RAM, in a page file on disk, or inside the hard disk's internal cache. Ensuring writes reach the final memory location in RAM doesn't ensure that your sensitive data is wiped from all the places it could be stored. – Gianluca Ghettini, Nov 07 '12 at 22:03
@G_G: Yes, indeed, but getting to all those places is not trivial for an attacker, whereas the same program accidentally sending contents of its own address space is trivial. – sharptooth, Nov 08 '12 at 06:34
I don't understand the question. How does `volatile` not do what you intended? — user541686, Nov 08 '12 at 08:21
@Mehrdad: According to the Standard it only matters how `buffer` is declared, not the pointers to it. – sharptooth, Nov 08 '12 at 09:07
@Simon Richter: He likely refers to C++0x and I only have C++03 at hand and I can't find any equivalent in `intro.execution` of C++03 Standard. — sharptooth, Nov 08 '12 at 10:53
@avakar: Could you maybe post an answer with a citation from there? — sharptooth, Nov 08 '12 at 12:27
@sharptooth: In case you're still wondering, `[intro.execution]/12` in C++11 is just the thing about accesses through volatile lvalues being side effects. It doesn't say that they're observable any more than C++03 does. – Steve Jessop, Dec 06 '12 at 15:46

score 8 · Accepted Answer · answered Nov 08 '12 at 09:18

There is no portable solution. If it wants to, the compiler could have made copies of the data while you were using it in multiple places in memory and any zero function could zero only the one it's using at that time. Any solution will be non-portable.
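
To illustrate what "non-portable" tends to look like in practice, here is a minimal sketch that picks a platform- or compiler-specific mechanism at compile time. The name `wipe_nonportable` is hypothetical, and the choice of mechanisms is an assumption: `SecureZeroMemory` on Windows, an empty GCC/Clang inline-asm statement with a "memory" clobber elsewhere, and a plain volatile loop as a last resort. None of this addresses the point above about copies the compiler may have made elsewhere.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#if defined(_WIN32)
#  include <windows.h>
#endif

// Hypothetical wrapper: zero `len` bytes at `ptr` using whatever the
// current toolchain documents as "will not be optimized away".
void wipe_nonportable(void* ptr, std::size_t len) {
    assert(ptr != nullptr);
#if defined(_WIN32)
    // Windows: rely on the guarantee documented for SecureZeroMemory.
    SecureZeroMemory(ptr, len);
#elif defined(__GNUC__) || defined(__clang__)
    std::memset(ptr, 0, len);
    // Empty asm statement that takes `ptr` as input and clobbers memory.
    // The compiler must assume it may read the bytes just written, so
    // the memset above is not treated as a dead store.
    __asm__ __volatile__("" : : "r"(ptr) : "memory");
#else
    // Fallback: write each byte through a volatile lvalue.
    volatile unsigned char* p = static_cast<volatile unsigned char*>(ptr);
    while (len--) {
        *p++ = 0;
    }
#endif
}
```

Each branch depends on a promise made by a particular implementation, not by the C++ Standard, which is the sense in which any solution is non-portable.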

answered Nov 08 '12 at 09:18

David Schwartz

score 4 · Answer 2 · edited Dec 04 '18 at 09:06

With library functions like SecureZeroMemory, the library writers will typically have taken pains to ensure that such functions will not be inlined by the compiler. This means that in the snippet

char buffer[size];
SecureZeroMemory( buffer, size );

the compiler does not know what SecureZeroMemory does with buffer, so the optimizer can't prove that taking the snippet out does not affect the observable behaviour of the program. In other words, the library writers will already have done all that is possible to ensure such code is not optimized away.
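
One commonly seen way to get a similar "the compiler cannot see what is called" effect without a separate library is to call `memset` through a volatile function pointer. The sketch below is only an illustration of that idea; the names `memset_opaque` and `wipe` are hypothetical, and whether a given optimizer really leaves the call alone is a property of that implementation, not something the Standard spells out.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// The pointer object itself is volatile, so its value has to be loaded
// at each call; the optimizer cannot assume it still points at memset
// and therefore cannot reason about what the callee does.
static void* (*const volatile memset_opaque)(void*, int, std::size_t) = std::memset;

// Hypothetical helper: zero `len` bytes at `ptr` via the opaque call.
void wipe(void* ptr, std::size_t len) {
    assert(ptr != nullptr);
    memset_opaque(ptr, 0, len);
}
```

Usage is the same as in the snippet above: `wipe(buffer, size);` in place of `SecureZeroMemory(buffer, size);`.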

edited Dec 04 '18 at 09:06

Gelldur

answered Nov 08 '12 at 08:41

Bart van Ingen Schenau

In the case of Visual C++, `SecureZeroMemory()` is implemented right in the `WinNT.h` header, so the compiler can see its inner workings. – sharptooth Nov 08 '12 at 09:08
@sharptooth: Then it is likely that the authors of `SecureZeroMemory` know that the use of `volatile` prevents the Visual C++ optimizer from optimizing the code away. The authors of `SecureZeroMemory` give certain guarantees about the function, so it is their job to ensure the optimizer does not void those guarantees. – Bart van Ingen Schenau Nov 08 '12 at 09:14
Well, great, now I have to port the code to some other compiler - what do I do? – sharptooth Nov 08 '12 at 09:16
@sharptooth: You find a function with similar guarantees that is available for the other compiler (or one available for both compilers), and then you optionally write a wrapper function to make the signatures of the functions compatible. This is basically the same effort you need to do for any function not available when porting code. – Bart van Ingen Schenau Nov 08 '12 at 09:22
@sharptooth: That's not your problem. To be compatible, the new compiler will have to support all the extensions needed to compile WinNT.h. This is no exception. – MSalters Nov 08 '12 at 09:35
@MSalters: `SecureZeroMemory()` itself is not what bothers me, the approach it uses is. – sharptooth Nov 08 '12 at 10:14
@sharptooth: For something with semantics as tricky as `SecureZeroMemory`, you're going to need some help from the compiler. No getting around that. Another compiler may use another approach, so don't get hung up on this particular approach. – MSalters Nov 08 '12 at 10:27

score 2 · Answer 3 · answered Nov 08 '12 at 10:00

The volatile keyword can be applied to a pointer (or reference, in C++) without requiring a cast, meaning that accesses through this pointer are not to be optimized out. The declaration of the variable does not matter.

The behaviour is analogous to const:

char buffer[16];
char const *p = buffer;

buffer[0] = 'a';          // okay
p[0] = 'b';               // error

That a const pointer to the buffer exists does not alter the behaviour of the variable in any way, only the behaviour of the modified pointer. If the variable is declared const, then it is forbidden to generate non-const pointers to it:

char const buffer[16];
char *p = buffer;         // error

Similarly,

char buffer[16];
char volatile *p = buffer;

buffer[0] = 'a';          // may be optimized out
p[0] = 'b';               // will be emitted

and

char volatile buffer[16];
char *p = buffer;         // error

The compiler is free to remove accesses through non-volatile lvalues as well as function calls where it can prove that no accesses to volatile lvalues happen.

The RtlSecureZeroMemory function is safe to use because the compiler can either see the definition (including the volatile access inside the loop or, depending on the platform, the assembler statement, which is opaque to the compiler and thus assumed to be unoptimizable), or it has to assume that the function will perform a volatile access.

If you wish to avoid the dependency on the <winnt.h> header file, then a similar function will work fine with any conforming compiler.
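
A minimal sketch of such a "similar function", written without any Windows types, might look like the following. The name `secure_zero` is hypothetical; the body simply mirrors the non-AMD64 branch of `RtlSecureZeroMemory` quoted in the question. It only ensures that the stores are volatile accesses and says nothing about copies of the data held elsewhere.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper: zero `len` bytes at `ptr`, writing every byte
// through a volatile-qualified lvalue so each store is a volatile access.
void* secure_zero(void* ptr, std::size_t len) {
    assert(ptr != nullptr || len == 0);
    volatile unsigned char* p = static_cast<volatile unsigned char*>(ptr);
    while (len--) {
        *p++ = 0;
    }
    return ptr;
}
```

It would be called as `secure_zero(buffer, sizeof buffer);` on a buffer that is not itself declared volatile.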

answered Nov 08 '12 at 10:00

Simon Richter

This is all great, but a tiny citation from C++ Standard (preferably C++03) would be just great. – sharptooth Nov 08 '12 at 10:55
That doesn't work. Nothing prevents the compiler from making a copy of the buffer for you to operate on with your `volatile` pointers so long as all subsequent accesses go to the copy. So that won't ensure the previous data is erased. All the previous accesses can be optimized however the compiler likes. – David Schwartz Nov 08 '12 at 11:38
@David Schwartz: Whatever, let's pretend it doesn't happen. How do I prevent writes being optimized away? – sharptooth Nov 08 '12 at 12:28
Just pretend they aren't optimized away then. – David Schwartz Nov 08 '12 at 13:02
@David Schwartz: Nope. I have a very specific problem. You say my problem is irrelevant, because there's a ton of other problems. I still want a solution for my problem. – sharptooth Nov 09 '12 at 08:19
@sharptooth: Perhaps we could help you if you told us what the problem actually was. – David Schwartz Nov 09 '12 at 14:26
@David Schwartz: The problem is I want code that is equivalent to `memset()` but is surely not optimized out. – sharptooth Nov 09 '12 at 14:35
@sharptooth: And the answer is that this is impossible to do portably. Sorry. You'll have to tell us your platform. – David Schwartz Nov 09 '12 at 14:39
@David Schwartz: Well, I can't know all the target platforms in advance. That's why I asked what the Standard has to say. – sharptooth Nov 09 '12 at 14:52
I agree with @David that the compiler is free to copy the buffer unless it is itself declared `volatile`. So the proper course of action is indeed to qualify the buffer declaration. – Simon Richter Nov 09 '12 at 15:04
@sharptooth: Then you have your answer -- it can't be done. Sorry. Your question is basically, "How can I do something that can't be done portably in a portable way?" And the answer is that you can't. You have an impossible requirement. – David Schwartz Nov 09 '12 at 15:10
May I have a citation for the statement "The compiler is free to remove accesses through non-volatile lvalues as well as function calls where it can prove that no accesses to volatile lvalues happen."? What is it based upon? – sharptooth Nov 12 '12 at 09:00
I don't have my copy of the standard at hand; the general idea is that the compiler is free to inline functions and move their contents around as long as the behaviour of the abstract machine is unchanged. Opaque functions are assumed to contain observable events, so they cannot be optimized out. – Simon Richter Nov 12 '12 at 09:29

score 1 · Answer 4 · answered Nov 08 '12 at 09:53

There is always a race condition between when there is sensitive information in memory and the time you wipe it out. In that window of time your application could crash and dump core or a malicious user could get a memory dump of the process' address space with sensitive information in plain text.

Maybe you should not store sensitive information in memory in plain text. This way you achieve better security and bypass this issue completely.

answered Nov 08 '12 at 09:53

Maxim Egorushkin

Memory dumps are just great, yet getting them is not trivial. It's much more likely that a program accidentally sends the data itself. – sharptooth Nov 08 '12 at 10:16

score 1 · Answer 5 · answered Jul 30 '18 at 16:12

Neither the C nor C++ Standard imposes any requirements on how implementations store things in physical memory. Implementations are free to specify such things, however, and quality implementations which are suitable for applications requiring certain physical-memory behaviors will specify that they will consistently behave in suitable fashion.

Many implementations process at least two distinct dialects. When processing their "optimizations disabled" dialect, they often document in great detail how many actions will interact with physical memory. Unfortunately, enabling optimizations will usually switch in a semantically weaker dialect which guarantees almost nothing about how any actions will interact with physical memory. While it should be possible to perform many simple and straightforward optimizations while still processing things in a fashion that is consistent with the "optimizations disabled" dialect in certain easily identifiable cases where it would be likely to matter, compiler writers aren't interested in providing modes that focus on the safe low-hanging fruit.

The only reliable way to ensure that physical memory is treated in a certain fashion is to use a dialect that promises to treat physical memory in that fashion. If one does that, getting the required treatment will generally be easy. If one doesn't, nothing will guarantee that a "creative" implementation won't do something unexpected.
