Consider the code:

procedure DoSmthSecret;
var
  Seed: array[0..31] of Byte;

begin
// get random seed
  ..
// use the seed to do something secret
  ..
// erase the seed
  FillChar(Seed, SizeOf(Seed), 0);
end;

The problem is that FillChar is a compiler intrinsic, so a compiler could in principle "optimize it out", because Seed is never read after the call. This is a known issue with C/C++ compilers; see SecureZeroMemory. Can modern Pascal compilers (Delphi, FPC) perform such an optimization, and if so, do they provide an equivalent of SecureZeroMemory?

kludg
  • The Delphi compiler certainly cannot do that optimisation; I don't know about FPC. – David Heffernan Mar 02 '16 at 09:40
  • FillChar is most definitely not optimized out. Anyway, optimization should never change the outcome, and nilling memory is such an outcome. – Rudy Velthuis Mar 02 '16 at 09:59
  • @RudyVelthuis Code that writes to variables that cannot subsequently be read can be optimised out. That's why `SecureZeroMemory` exists. – David Heffernan Mar 02 '16 at 10:00
  • Obviously it should not be optimized out. I understand that an optimizer would optimize out a variable write if the variable is not used anymore. But a call to a function, even an intrinsic one, should, IMO, never be optimized out. – Rudy Velthuis Mar 02 '16 at 10:56
  • @RudyVelthuis Programmers around the world that care about performance are glad that you are not writing their compilers. I want my compiler to remove code whose effects cannot be observed by correct programs. – David Heffernan Mar 02 '16 at 11:01
  • Well, if you need SecureZeroMemory, it also happens in a not correct case, obviously. – Marco van de Voort Mar 02 '16 at 11:06
  • I don't want my compiler to remove code that is obviously intentionally placed there. If it does that, then that is much worse than not getting the last few nanoseconds out of the code. – Rudy Velthuis Mar 02 '16 at 11:06
  • @RudyVelthuis How did you determine that such optimisations do not yield significant gains? Remember that compilers are designed to compile a broad range of code, and not designed just to compile the code in this question. – David Heffernan Mar 02 '16 at 11:09
  • Yes, but a compiler that optimizes out such code, e.g. a loop that doesn't obviously produce any output, or a ZeroMemory that does not obviously have any effect on the rest of the program, is naive. If people put in such code, it has a meaning and should not be optimized out. If it is not necessary, the people should take it out, not the optimizer. I am sure that most of those who want full speed do not rely on optimizers anyway. They will take out code that is not needed themselves. That leaves those who are not speed gurus and want the compiler to do everything for them. – Rudy Velthuis Mar 02 '16 at 11:15
  • @RudyVelthuis Well, I care about perf and I want my compiler to do more for me. – David Heffernan Mar 02 '16 at 11:17
  • I want my compiler to do more too, but not to remove code that I deliberately put there. That is not optimization. The code may be faster, but it is wrong and that should not happen. – Rudy Velthuis Mar 02 '16 at 11:23
  • @RudyVelthuis No, it's not wrong. How could it be wrong to avoid writing to a variable that is never read? I don't think you have a clear understanding of what optimisation is. – David Heffernan Mar 02 '16 at 11:36
  • `function SecureZeroMemory(ptr : Pointer;cnt : SIZE_T) : Pointer; begin FillChar(ptr^, cnt, 0); result := ptr; end;` ? – Remko Mar 02 '16 at 12:22
  • @Remko - this is a crutch; a better solution is to tell the compiler not to optimize out `FillChar` calls, locally or globally. But since the problem does not exist in Pascal compilers today, it can be safely ignored. – kludg Mar 02 '16 at 12:46
  • @RudyVelthuis How does the compiler know whether you put the code there on purpose or left it there by accident? Delphi definitely needs to do more optimization, not less. We have to use the Intel C++ compiler from time to time to eke out more performance because the Delphi compiler is just not up to the job. – Graymatter Mar 03 '16 at 02:09
  • @Graymatter: if it is there, it is there for a purpose. Optimizers do not have to erase my accidents. – Rudy Velthuis Mar 03 '16 at 07:24
  • @Rudy If the program's behaviour is not affected, why do you care? Do you understand the "as-if" rule? Accident or on purpose is not relevant. It's not hard to construct examples where code placed there on purpose can be removed safely. – David Heffernan Mar 03 '16 at 07:36
  • Exactly because of situations as the current question. Code that nulls memory should obviously not be eliminated here. – Rudy Velthuis Mar 03 '16 at 07:45
  • @Rudy It's not at all obvious to a compiler. They don't have the intelligence of a human. They have "as-if" rule. If the compiler won't optimise code like this then real world programs will be slower. There are people for whom that matters. It's as if you don't accept that. – David Heffernan Mar 03 '16 at 07:54
  • Of course it is not obvious to the compiler. That is why it should never remove code put there by the programmer. As I said, there are many more things it can do to optimize. It should just not eliminate code. It can warn or hint, but not simply eliminate. The fact that there are mechanisms like SecureZeroMemory, which cannot be eliminated, means that some compilers (e.g. VC++) sometimes eliminate code that should not be eliminated. That is bad. – Rudy Velthuis Mar 03 '16 at 08:52
  • @RudyVelthuis A good optimizer has to remove code. What about a simple `if false then`? The code following it is worthless. It will never be executed. The compiler treats `a := 1; a := 3` the same way. Why do 2 assignments? The point in an optimizer is to speed up the code as much as possible without changing the result. In this case, it's reasonable for a compiler to assume that the code serves no purpose and to strip it out. There is always the option of adding `{$O-}` before such code and to put `{$O+}` after it. That way you are indicating that the code is relevant to you. – Graymatter Mar 04 '16 at 00:54
  • A good optimizer can remove the `then` clause of `if False then`, or code after a `return` or `Exit`, since it can't be executed. But an unconditional function call can be executed, even if, to the compiler, it doesn't make sense. Hands off. Hint or warning are OK, blunt optimization isn't. – Rudy Velthuis Mar 04 '16 at 07:14
  • @David: why do people always tell others they have no understanding of the issues, only because they don't agree? I fully understand how optimizers work and I think that some optimizers go too far. It is pretty clear that the programmer wanted FillChar (or memset) to clear memory. The optimizer doesn't know why a function call is there, and so shouldn't assume it can be removed. – Rudy Velthuis Mar 04 '16 at 07:18
  • @Rudy I don't think you have any understanding of the "as-if" rule. Do you? We aren't discussing opinion here. As-if says that this code can be removed. That's a simple fact. – David Heffernan Mar 04 '16 at 07:27
  • Yes, I know C++ has such a rule. But that doesn't mean it is OK. And, AFAIK, most C++ optimizers currently in use would not eliminate `memset()`. Heck, C++ even allows copy elision, i.e. the elimination of copy or move constructors, *even if they have observable side effects*. That is, IMO, absolutely a no-no. – Rudy Velthuis Mar 04 '16 at 11:59
  • In the case here: I think it is fine if the call to the intrinsic `FillChar` is replaced by more optimized inline code that nulls the same memory (setting the 32 bit seed to 0 directly should be easier than calling `FillChar`). I think it would be utterly wrong if the compiler ever eliminated the entire nulling. – Rudy Velthuis Mar 04 '16 at 12:04
  • It's all very well you deciding that you don't like an entire class of optimisations just because of one exceptionally obscure corner case. A corner case that is outside the scope of the language, and perfectly easy to work around. But if you take that point of view you'll end up banning almost all optimisations. Why pick on function calls. The assignment operator can also be used to modify a variable's value. You are going to object to any optimisation that removes assignment operators too? If you are going to reject the as-if rule, can you propose something better? – David Heffernan Mar 04 '16 at 12:04
  • So what about `a := 1; a := 3;`. You would also object to the compiler removing the first assignment? You cannot have it both ways. – David Heffernan Mar 04 '16 at 12:05
  • Eliminating the first if these are simple assignments of built-in types (i.e. the assignment does not call an overloaded implicit operator) is fine. If these are operator overloads, the call might have (probably desirable - even if they are not observable) side effects and should not be removed. But then we are talking about user or external library code, and these are not eliminated anyway. – Rudy Velthuis Mar 04 '16 at 12:40
  • But I would actually prefer that the compiler does not remove the first call. It should, IMO, simply hint or warn instead, so the programmer can take action, if he thinks it should be eliminated. – Rudy Velthuis Mar 04 '16 at 12:42
  • Yes, but you have to come up with a clear framework and set of rules for deciding whether or not such optimisations are allowed. That's what "as-if" gives you. Now suppose we had, `a := 1; foo(); a := 3;` where `a` is a local that cannot be seen inside `foo`. Can we remove the first assignment to `a`? – David Heffernan Mar 04 '16 at 12:42
  • I don't care what "as-if" gives me. I'm sure that, if pressed to do so, I could come up with my own set of rules, and they would not eliminate as much as "as-if" allows. Actually, most compilers are not that radical. – Rudy Velthuis Mar 04 '16 at 12:44
  • I very much doubt that you could come up with a decent set of rules. You've not come close even for this simple example. Quite how you feel that you are better than the C++ standards committee is astonishing. – David Heffernan Mar 04 '16 at 12:45
  • @david: you can come up with a lot of scenarios. I won't comment on all of them. If you want me to write a set of rules or to write an optimizing compiler, fine, but then I should be paid for it. – Rudy Velthuis Mar 04 '16 at 12:45
  • I don't care what you doubt. I am sure I could. But they would be different and not as radical as what "as-if" or copy elision allow. – Rudy Velthuis Mar 04 '16 at 12:46
  • That's the point. You cannot rubbish "as-if" on the basis of a single example. You need an in depth knowledge of a wide range of sample input programs, and potential optimisations. That's what the standards committee has. You and I don't have that knowledge. I'm a programmer with a rather narrow field of interest and experience. You are a dentist. – David Heffernan Mar 04 '16 at 12:47
  • Actually, I **can** rubbish it exactly because of that example. It produces code that does not do what the programmer tells it to do. C++ programmers may take that for granted and even try to program around it, but I wouldn't. If the optimizer notices such a problem, it should warn about it, if told to do so, but it should not simply remove the code. I am glad that Delphi does not eliminate such code. – Rudy Velthuis Mar 04 '16 at 12:52
  • A classic case of one extreme corner case being used to influence the design and make the mainstream usage worse. – David Heffernan Mar 04 '16 at 12:57
  • Even extreme corner cases should not cause a fault. The fact that it goes well most, but not all, of the time means it is faulty and unreliable. – Rudy Velthuis Mar 04 '16 at 13:09
  • @RudyVelthuis There is no fault. The optimisation does not change observable behaviour for any legal program. The as-if rule is satisfied. I guess what you are getting mixed up over is the difference between the language and a specific implementation on a specific platform. – David Heffernan Mar 04 '16 at 13:24
  • But who says that optimization should only retain the *observable* behaviour? Yes, your favourite "as-if" rule does, but I don't agree with that. It should leave the *intended* behaviour, as indicated by the code written. So it should not even optimize `a := 3; a := 7;`, if a is not a POD type. It should warn, at most. It can eliminate *dead* code, i.e. unreachable code, sure, if it does not contain a label. You just seem to be stuck in the "as-if" rule, as if it were the only possible option. – Rudy Velthuis Mar 05 '16 at 08:56
  • @Rudy If you could come up with an alternative, you'd do so. You are incapable of doing so. The standard doesn't mandate much about the machine. You can't say much other than in terms of observable behaviour. Given the nature of the standard, as-if is unavoidable. You argue for banning all optimisations that remove assignments. All that reorder. All that enregister. You'd have to erect memory barriers to avoid CPU reordering. And for what? The optimisation in the Q doesn't cause any problems! – David Heffernan Mar 05 '16 at 12:43
  • Of course I can come up with an alternative, but it would require some work. The fact that you want me to prove something is not incentive enough, sorry. And of course "observable behaviour" is not the only criterion one can choose. Have you no imagination? – Rudy Velthuis Mar 05 '16 at 13:44
  • @rudy If you could do it, you would. But you can't. And when you try to it will be full of holes. We've already seen that. – David Heffernan Mar 05 '16 at 14:58
  • The reason you can't write your variant of as-if to exclude this one specific optimisation is that nothing in the standard provides you with the tools to express what you are attempting to state. The standard is far too general to allow you to do this. – David Heffernan Mar 05 '16 at 15:51
  • Of course I can. But it is not a simple one liner and I certainly don't want to waste time on it. Fact is that dead code elimination is only a small part of optimization, so even if it were not allowed at all, it would not make a big difference. – Rudy Velthuis Mar 05 '16 at 19:01
  • I don't really want to write a variant of "as-if", or any other rule. I don't care if they have it in C++. What makes you think there is nothing in the standard that lets me express what I want to state? – Rudy Velthuis Mar 05 '16 at 19:04
  • The standard doesn't go into sufficient detail for you to express what you want. You could read it. It would also be pointless to ban this optimisation. – David Heffernan Mar 05 '16 at 21:41
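
The `{$O-}` / `{$O+}` suggestion from the comments above can be sketched roughly as follows, in Delphi flavour. `ZeroSecret` is a hypothetical helper name, not an RTL routine. In Delphi the `$O` switch applies to whole routines, not to individual statements, so the directives wrap the helper instead of the single `FillChar` line. Whether FPC accepts the same spelling of the directive is an assumption to check against its documentation. Nobody in the thread reports a Pascal compiler that actually removes the call, so treat this as a precaution, not as a fix for an observed bug.

```pascal
program ZeroSecretDemo;
{$APPTYPE CONSOLE}

// Remember whether optimization was on, then switch it off for the helper.
{$IFOPT O+}
  {$DEFINE RESTORE_OPTIMIZATION}
  {$O-}
{$ENDIF}

// Hypothetical helper (illustrative name): overwrites Count bytes of Buf with 0.
procedure ZeroSecret(var Buf; Count: Integer);
begin
  FillChar(Buf, Count, 0);
end;

// Restore the previous optimization setting for the code that follows.
{$IFDEF RESTORE_OPTIMIZATION}
  {$O+}
  {$UNDEF RESTORE_OPTIMIZATION}
{$ENDIF}

procedure DoSmthSecret;
var
  Seed: array[0..31] of Byte;
begin
  FillChar(Seed, SizeOf(Seed), $5A);  // stand-in for obtaining and using a real seed
  ZeroSecret(Seed, SizeOf(Seed));     // erase the seed before returning
end;

begin
  DoSmthSecret;
  WriteLn('done');
end.
```

Saving and restoring the switch with `{$IFOPT}` avoids forcing optimization on in a build that had it off to begin with, which a bare `{$O+}` after the helper would do.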

2 Answers

FPC can't do such optimizations at the moment, and as far as I know, even in C++ they belong to the "uncertain" class (since, with this optimization, the state of the program ignores what the programmer told it to be).

Solving such a problem is a matter of defining which constructs can be optimized out and which cannot. It doesn't need API/OS assistance per se; any externally linked object file containing such a function would do (since global optimization then wouldn't touch it).

Note that the article doesn't name a specific C++ compiler, so I expect it is more of a general utility function for when a user of a compiler runs into problems and doesn't want to dig too deep into the docs, or when the code must work easily on multiple (Windows-only!) compilers without overly complicating the build system.

Choosing a non-inlinable API function might be suboptimal in other cases, especially when zeroing small, constant sizes, precisely because it won't be inlined. So I would be careful with this function and make sure there is a hard need for it.

It might matter mainly when an external entity can change a program's memory (DMA, memory mapping, etc.), or to erase passwords and other sensitive information from the memory image, even if, as far as the compiler can tell, the program will never read it again.
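
A minimal sketch of the "separately compiled, non-inlined routine" idea might look like this. The unit name `SecureMem` and the routine name `SecureZero` are hypothetical, chosen only for illustration. The sketch assumes that the compiler does not inline across units unless asked to and does not analyse the routine's stores from the caller's side; that matches what is said above about current Delphi and FPC, but it is not a language guarantee.

```pascal
unit SecureMem;  // hypothetical unit name, for illustration only

{$IFDEF FPC}{$MODE DELPHI}{$ENDIF}  // make Integer 32-bit under FPC as well

interface

// Overwrites Count bytes starting at Buf with zeros.
// Deliberately NOT declared "inline", so call sites only see an opaque call.
procedure SecureZero(var Buf; Count: Integer);

implementation

procedure SecureZero(var Buf; Count: Integer);
begin
  // Ignore zero or negative sizes instead of passing them on to FillChar.
  if Count > 0 then
    FillChar(Buf, Count, 0);
end;

end.
```

A caller would add `SecureMem` to its `uses` clause and replace the final `FillChar(Seed, SizeOf(Seed), 0)` with `SecureZero(Seed, SizeOf(Seed))`. The cost is the one noted above: a real call for every erase, even for small constant sizes.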

Marco van de Voort
  • " (since the state of the program due to this optimization ignores what the programmer tells it to be)" Exactly. Such code is there for a purpose and should never be optimized out. – Rudy Velthuis Mar 02 '16 at 11:18
  • Such optimization levels AND functions like SecureZeroMemory are for people that know what they are doing AND have a lot of time to do it. If there were no downsides to higher optimization levels, they would be the default. – Marco van de Voort Mar 02 '16 at 11:23
  • I still don't want an optimizer to eliminate code that may look dead but isn't. If that code is not needed, I will take it out. – Rudy Velthuis Mar 02 '16 at 11:35
  • @RudyVelthuis Your argument could be used with almost any optimisation. It seems that you believe that compilers should not perform any optimisation. For example, you clearly believe that, where `i` is a local, that when presented with `i := 42; i := 666;` both assignments should be performed. – David Heffernan Mar 02 '16 at 11:48
  • C++ compilers really do it - see http://stackoverflow.com/questions/15538366/can-memset-function-call-be-removed-by-compiler – kludg Mar 02 '16 at 11:56
  • No, I don't believe that they should not optimize at all. That's nonsense. But optimization is much more than dead code elimination. I have problems with what some optimizers identify as dead code. ISTM that loops, function calls, etc. even if they serve no obvious purpose, should not be eliminated. The compiler can give a hint or warning, fine. – Rudy Velthuis Mar 02 '16 at 12:09
  • @RudyVelthuis It is nonsense, but it is the logical conclusion of your argument. – David Heffernan Mar 02 '16 at 12:09
  • Updated with the reason why it is "secure". Probably should have read MSDN fully. I've been doing too much embedded lately, so my mind directly went off in the direction of DMA etc. – Marco van de Voort Mar 02 '16 at 14:21
  • @David: no it certainly isn't. Dead code elimination is only one part of optimizing. Arranging registers, aligning, and the many other higher level optimizations like loop unrolling, etc. can still take place. – Rudy Velthuis Mar 03 '16 at 07:27
  • @Rudy Your argument is illogical and inconsistent. You haven't absorbed the fact that compiler writers target the full range of input programs. Not just your hand picked programs. Removing code that cannot be reached can improve performance. Removing code that has no effect on the program can improve performance. It's easy to construct more interesting examples where it's hard for the coder to remove such code, but where the compiler can. – David Heffernan Mar 03 '16 at 07:34
  • I think it is mainly something from the embedded space, where removing dead code saves flash. Most big projects avoid higher levels of optimization because the risk of compiler bugs increases. A few years ago, FreeBSD used gcc -O by default. – Marco van de Voort Mar 03 '16 at 09:25
  • @MarcovandeVoort Dead code takes up space and that can impact on caching performance when the CPU has to read the code that it is to execute. – David Heffernan Mar 03 '16 at 10:34
  • I think TLB and the actual clocks saved by removed instructions are more important than the load/space cache effects. But anyway, that is not what that kind of optimization was developed for. They were developed for the embedded space, where "global" optimizations started because they are easier there and the gains are larger, such as eliminating variables (and thus RAM) and code (thus flash, in the hope of squeezing it into a chip one size category smaller). Anyway, if you have benchmarks that show a significant effect otherwise, I'd be interested. – Marco van de Voort Mar 03 '16 at 12:19
  • @David: My argument is not illogical and inconsistent. If they target the full range, but their optimization produces undesired results in only a few, it is still bad and should not be implemented that way. A math library that produces a bad result every 100,000 calculations is still faulty. Or are you saying: "yes, it may not work out as desired a few times, but most of the time it works splendidly, so that's OK"? – Rudy Velthuis Mar 04 '16 at 12:58
  • @RudyVelthuis Your statements all follow on from your lack of understanding of as-if – David Heffernan Mar 04 '16 at 12:59
  • That's nonsense, and you know it. I fully understand "as-if", I just don't agree with it and with its implications. – Rudy Velthuis Mar 04 '16 at 13:01
  • @Rudy No, you really don't understand as-if. The optimisation considered here is legal because of the as-if rule. That the security implications for a program under attack go beyond the as-if rule is really another matter. The as-if rule is designed around the language, not implementation and platform specific issues. If issues like this were widespread then the standards committee might have thought differently. You want to throw the baby out with the bath water because you do not understand why as-if came about, and do not have any empathy with other developers who might have different motivations. – David Heffernan Mar 04 '16 at 13:11
  • Yes, it is legal, no doubt. I just don't agree with it. I don't think an optimizer should do that, regardless of platform, and apparently most C++ optimizers don't do it. IMO, it should not be legal. Is that so hard to understand? – Rudy Velthuis Mar 04 '16 at 16:14
  • @Rudy The problem is that you haven't given a rule that can be applied by compiler authors. – David Heffernan Mar 04 '16 at 18:20
  • Well, as soon as a compiler author asks me, a dentist, I will gladly formulate such a rule, unambiguously. But until then, why should I? – Rudy Velthuis Mar 04 '16 at 18:22
  • @RudyVelthuis What you are advocating is the LCD approach. That is one solution to the problem but not a very optimal one. They try to put something together that satisfies the majority of use cases. – Graymatter Mar 04 '16 at 19:11
  • @Graymatter: I have no idea what the LCD approach is, or how (sub)optimal it is, sorry. Do you have a link? – Rudy Velthuis Mar 04 '16 at 19:17
  • If it satisfies the majority of use cases, but eliminates code that should not be eliminated, it is, well, suboptimal too. What good does it do when my code is fast but sometimes wrong? – Rudy Velthuis Mar 04 '16 at 19:24
  • @RudyVelthuis LCD = Lowest common denominator. What I was saying is that you are looking for an approach that doesn't "modify" any code. So you only want registers and a few other optimizations. It's just a very limiting approach. – Graymatter Mar 04 '16 at 19:24
  • I am not saying that it should not modify any code. That would be a pretty lame (but very fast ) optimizer. It should however not *remove* explicit function calls, not even to intrinsic functions. It can of course *replace* intrinsic functions by something more optimized but doing exactly the same, e.g. a simple `MOV RandomSeed,0` or the platform and language equivalent. So no, not a lowest common denominator. Elimination of code is only one of many optimization techniques that can be used by an optimizer. I would have no problem with others. Most optimizers can rearrange code pretty well. – Rudy Velthuis Mar 04 '16 at 19:32
  • Graymatter: your approach is called "the idiot and the salt": a bit of salt (read: optimization) is good, so a whole lot must be great. In reality, optimizations have diminishing returns. Just look at, e.g., recent Phoronix benchmarks and see that, when averaged over multiple benchmarks, anything beyond the first level (-O, aka -O1) hardly improves things. Uncertain optimizations have their uses, but they are not general purpose and therefore are not (and IMHO should not be) the default. – Marco van de Voort Mar 04 '16 at 20:09
  • @rudy You can't come up with a rule that is better than "as-if". Even if you were paid. You have no comprehension of what's involved in writing language standards. – David Heffernan Mar 04 '16 at 20:34
  • @MarcovandeVoort I am not saying that at all. What I am saying is the opposite. Just because there is the odd person that doesn't like salt that doesn't mean that you shouldn't add salt to any food. There is a balance. It's nice to have options for these (like GCC) but I would prefer the optimizer to work in a standard way. The problem is this. Optimizing code can take a substantial amount of time and the compiler can do things that we can't, at least not without a lot of work. If the compiler can see a better way to accomplish the same thing using something like "as-if" then it's beneficial. – Graymatter Mar 04 '16 at 21:04
  • @Marco What do you mean by "uncertain"? To me this is a routine application of the as-if rule. – David Heffernan Mar 04 '16 at 21:04
  • to the majority of users. For the odd cases (like security), where people need something different then they can work around it easily. – Graymatter Mar 04 '16 at 21:04
  • @David: Of course I can come up with a rule or rules better than "as-if". Everyone can. What makes you say I don't understand? Just because I don't agree with the "as-if" rule in C++? That's baloney. – Rudy Velthuis Mar 05 '16 at 08:43
  • @Rudy No you can't come up with a rule. Remember that such a rule is to be framed in the context of the ISO C++ language standard. Your informal attempts are already woefully lacking. You've identified functions as not to be optimised away. But you missed the assignment operator. If this was so easy you'd just state the rule. But you can't. Because "as-if" is the only viable choice given the constraints. – David Heffernan Mar 05 '16 at 08:59
  • What nonsense. Of course I can come up with such a rule. Everyone can, especially if they are not fixed on retaining *observable* behaviour only. – Rudy Velthuis Mar 05 '16 at 09:02
  • Assignment operators, if overloaded, are functions too. What is the problem? And I never said it was easy. – Rudy Velthuis Mar 05 '16 at 09:02
  • @rudy never mind overloaded. I'm talking about in built. `seed = 0;`. You are talking utter nonsense due to a lack of understanding. – David Heffernan Mar 05 '16 at 09:12
  • I admit I don't understand what you are trying to say. I do, however, fully understand what can be done to optimize and what should, IMO, not be done. I do understand the "as-if" rule, but I think it goes too far. – Rudy Velthuis Mar 05 '16 at 09:31
  • If someone writes `seed = 0;`, that is intentional, and should not be eliminated, even if setting the seed to 0 has no *observable* behaviour. – Rudy Velthuis Mar 05 '16 at 09:33
  • @Rudy So your rule is what? Framed in the context of the language. As I said earlier, you are going to end up banning practically all optimisations. For what benefit? – David Heffernan Mar 05 '16 at 10:03
  • Of course I am not ending all optimizations. I merely want to restrict ***dead code elimination*** to the elimination of unreachable code only. Sheesh. I don't know what you know about optimization, except your pet "as-if" rule, but dead code elimination is generally only a small part of optimizing. A moderately good programmer will hardly produce any dead code, and the little he produces can easily be "warned away". – Rudy Velthuis Mar 05 '16 at 13:51
  • (Classically, dead code elimination is a linker's game.) As for uncertain optimization, I might have been wrong there. It was an internal term for anything that could break a standards-complying application that was running fine. The trouble was that the breakage comes from outside the complying application. – Marco van de Voort Mar 05 '16 at 14:41
  • @Rudy Why would you want to avoid eliminating code that has no observable effect? What would be the benefit of doing that? – David Heffernan Mar 05 '16 at 14:56

Even if FreePascal were to optimize out writes to memory that is never read again (which I doubt it does at the moment, regardless of how long you guys discuss it), it does support the `absolute` modifier, which it guarantees (this is documented) never to optimize (somewhat similar to `volatile` in C/C++).
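
As a rough sketch of how this could be applied to the code from the question: declare a second variable overlaid on `Seed` with `absolute` and erase through it. `SeedAlias` is an illustrative name. The sketch takes the documented guarantee mentioned above as given and does not verify it; it only shows the syntax.

```pascal
program AbsoluteDemo;
{$IFDEF FPC}{$MODE DELPHI}{$ENDIF}
{$APPTYPE CONSOLE}

procedure DoSmthSecret;
var
  Seed: array[0..31] of Byte;
  // Overlay declared at the same address as Seed (illustrative name).
  SeedAlias: array[0..31] of Byte absolute Seed;
begin
  FillChar(Seed, SizeOf(Seed), $5A);          // stand-in for obtaining and using a real seed
  FillChar(SeedAlias, SizeOf(SeedAlias), 0);  // erase the seed through the overlay
end;

begin
  DoSmthSecret;
  WriteLn('done');
end.
```

Both variables occupy the same 32 bytes, so zeroing `SeedAlias` zeroes `Seed`.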

tofro