88

I am encrypting the user's input for generating a string for password. But a line of code gives different results in different versions of the framework. Partial code with value of key pressed by user:

Key pressed: 1. Variable ascii is 49. Value of 'e' and 'n' after some calculation:

e = 103, 
n = 143,

Math.Pow(ascii, e) % n

Result of above code:

  • In .NET 3.5 (C#)

    Math.Pow(ascii, e) % n
    

    gives 9.0.

  • In .NET 4 (C#)

    Math.Pow(ascii, e) % n
    

    gives 77.0.

Math.Pow() gives the correct (same) result in both versions.

What is the cause, and is there a solution?

Habib
  • 219,104
  • 29
  • 407
  • 436
Rajiv Bhardwaj
  • 293
  • 3
  • 9
  • 2
    For the sake of debugging, if you hard code the result of `Math.Pow` does it work? – dee-see Jan 31 '14 at 13:59
  • 1
    It would be trivial for you to provide an SSCCE. Please do so. – David Heffernan Jan 31 '14 at 14:02
  • 1
    @David what is a SSCCE?? – RononDex Jan 31 '14 at 14:02
  • 2
    @RononDex [SSCCE](http://sscce.org/) – Vincent van der Weele Jan 31 '14 at 14:03
  • 3
    Don't work with floats, but use a [`BigInteger`](http://msdn.microsoft.com/en-us/library/system.numerics.biginteger(v=vs.100).aspx) struct to hold the values. There are implementations out there for .NET 3.5 – John Alexiou Jan 31 '14 at 14:05
  • 2
    Note that correct answer is 114 ;) After this, answer for your question comes automatically – Boris Parfenenkov Jan 31 '14 at 14:06
  • 12
    Of course, both answers in the question are wrong. The fact that you don't seem to care about that is, well, worrying. – David Heffernan Jan 31 '14 at 14:08
  • Also look into `Math.DivRem()` – John Alexiou Jan 31 '14 at 14:09
  • 34
    You need to go back several steps. "I am encrypting the user's input for generating a string for password" this part is already dubious. What do you actually want to do? Do you want to store a password in encrypted or hashed form? Do you want to use this as entropy to generate a random value? What are your security goals? – CodesInChaos Jan 31 '14 at 14:50
  • 49
    While this question illustrates an interesting issue with floating point arithmetic, if the OP's goal is "encrypting the user's input for generating a string for password", I don't think rolling your own encryption is a good idea, so I wouldn't recommend actually implementing any of the answers. – Harrison Paine Jan 31 '14 at 14:51
  • 18
    Nice demonstration why other languages forbid the use of `%` with floating-point numbers. – Ben Voigt Jan 31 '14 at 17:21
  • 5
    While the answers are good, none of them answer the question of what has changed between .NET 3.5 and 4 that is causing the different behaviour. – msell Feb 01 '14 at 07:37
  • 2
    @msell because it doesnt matter :) aka if it doesnt break legit code nobody cares – NoSenseEtAl Feb 01 '14 at 07:42
  • 5
    The two highly-voted-up comments saying that the OP's whole approach is wrong, seem somewhat irrelevant and pedantic. He wants to know something about floating-point calculations. Maybe his methodology for encrypting passwords isn't very good... I doubt he is writing a piece of industrial software which is going to be put on ATMs or in nuclear power stations. Isn't it more likely that he playing with toy code to implement something interesting he read about? – jwg Feb 03 '14 at 09:08
  • 2
    @jwg see [Why do people question every question](http://meta.stackexchange.com/questions/108060) and the FAQ post [What is the XY problem](http://meta.stackexchange.com/questions/66377) – AakashM Feb 03 '14 at 09:52
  • @jwg: Fair enough, and I agree that the question merited a relevant answer. However, it's important that the issue is pointed out in case the question is read by someone who *is* attempting to implement their own encryption algorithm for industrial software. – Douglas Feb 03 '14 at 09:59
  • @Douglas, do you think it is likely that someone who is attempting to implement an encryption algorithm for industrial software will encounter the same problem with modular arithmetic differing between versions 3.5 and 4.0 or .Net? – jwg Feb 03 '14 at 10:04
  • @jwg: Yes (if they're diligent enough to test for it). You'd be surprised at how many developers attempt to implement their own encryption (driven by a sense of [security through obscurity](http://en.wikipedia.org/wiki/Security_through_obscurity)). – Douglas Feb 03 '14 at 10:11
  • @Douglas: I would certainly be surprised if a large number of clueless yet diligent developers, all working in positions of responsibility in the nuclear power industry, all chose to implement exactly this algorithm for encrypting passwords using the same .NET versions as the OP. – jwg Feb 03 '14 at 10:34
  • @jwg: I wasn't referring to the nuclear power industry (which is an minuscule portion of the software market). Are you saying that badly-written encryption algorithms only affect nuclear plants? – Douglas Feb 03 '14 at 10:41
  • 3
    @Douglas, no, I am saying that the urge for everyone who has read 'Practical Cryptography' to spout advice about using standard hashing algorithms to people who are playing with toy systems for their own amusement, is no less annoying, and probably no less dangerous, than the urge for people implementing security software to come up with their own poorly designed protocols on the back of cigarette packets. – jwg Feb 03 '14 at 10:50
  • 1
    @jwg: I politely disagree. Although it might be annoying, there is nothing "dangerous" about repeating what should be common knowledge. I might have conceded your point if the OP had confirmed that they were just "playing with toy systems for their own amusement"; but, in the absence of that assurance, your insistence on withholding practical advice is the one that's dangerous. – Douglas Feb 03 '14 at 10:56

6 Answers6

159

Math.Pow works on double-precision floating-point numbers; thus, you shouldn't expect more than the first 15–17 digits of the result to be accurate:

All floating-point numbers also have a limited number of significant digits, which also determines how accurately a floating-point value approximates a real number. A Double value has up to 15 decimal digits of precision, although a maximum of 17 digits is maintained internally.

However, modulo arithmetic requires all digits to be accurate. In your case, you are computing 49103, whose result consists of 175 digits, making the modulo operation meaningless in both your answers.

To work out the correct value, you should use arbitrary-precision arithmetic, as provided by the BigInteger class (introduced in .NET 4.0).

int val = (int)(BigInteger.Pow(49, 103) % 143);   // gives 114

Edit: As pointed out by Mark Peters in the comments below, you should use the BigInteger.ModPow method, which is intended specifically for this kind of operation:

int val = (int)BigInteger.ModPow(49, 103, 143);   // gives 114
Douglas
  • 53,759
  • 13
  • 140
  • 188
  • 20
    +1 for pointing out the real problem, namely that the code in the question is plain wrong – David Heffernan Jan 31 '14 at 14:08
  • 36
    It's worth noting that BigInteger provides a ModPow() method that performs (in my quick test just now) about 5 times faster for this operation. – Mark Peters Jan 31 '14 at 14:22
  • @MarkPeters: Thanks for the observation! I'll edit it into the answer. – Douglas Jan 31 '14 at 14:24
  • 8
    +1 With the edit. ModPow is not just fast, it is numerically stable! – Ray Jan 31 '14 at 14:49
  • 1
    @Ray: How does numerical stability apply to full-precision integers? `BigInteger` operations should never introduce any error. – Douglas Jan 31 '14 at 14:55
  • Shouldn't the OPs code throw an exception of some kind? Would this be considered an overflow of sorts? – makerofthings7 Jan 31 '14 at 17:48
  • 2
    @maker No, the answer is *meaningless*, not *invalid*. – Cody Gray - on strike Jan 31 '14 at 18:33
  • @CodyGray Is there ever a scenario this could have meaning? I would think that the developer should be alerted to or prevented from meaningless actions. I think this would be especially true in a type safe language like C# where casting is the key mechanism for protecting against meaningless assignments and usage. – makerofthings7 Jan 31 '14 at 18:54
  • @Douglas, you're right, I wasn't clear but I had in mind the modpow "approach" vs the pow-then-mod approach. In other words, this could stably be done with integers alone because it is so stable. But you are right, it doesn't have a direct effect in the situation provided. – Ray Jan 31 '14 at 19:55
  • 2
    @Ray: Thanks for the explanation; I now get what you meant (in the context of [modular exponentiation](http://en.wikipedia.org/wiki/Modular_exponentiation) algorithms). – Douglas Jan 31 '14 at 20:43
  • 3
    @makerofthings7: I agree with you in principle. However, imprecision is inherent to floating-point arithmetic, and it is deemed more practical to expect developers to be aware of the risks, than to impose restrictions on operations in general. If one wanted to be truly "safe", then the language would also need to forbid floating-point equality comparisons, to avoid unexpected results such as `1.0 - 0.9 - 0.1 == 0.0` evaluating to `false`. – Douglas Feb 01 '14 at 09:24
71

Apart from the fact that your hashing function is not a very good one *, the biggest problem with your code is not that it returns a different number depending on the version of .NET, but that in both cases it returns an entirely meaningless number: the correct answer to the problem is

49103 mod 143 = is 114. (link to Wolfram Alpha)

You can use this code to compute this answer:

private static int PowMod(int a, int b, int mod) {
    if (b == 0) {
        return 1;
    }
    var tmp = PowMod(a, b/2, mod);
    tmp *= tmp;
    if (b%2 != 0) {
        tmp *= a;
    }
    return tmp%mod;
}

The reason why your computation produces a different result is that in order to produce an answer, you use an intermediate value that drops most of the significant digits of the 49103 number: only the first 16 of its 175 digits are correct!

1230824813134842807283798520430636310264067713738977819859474030746648511411697029659004340261471771152928833391663821316264359104254030819694748088798262075483562075061997649

The remaining 159 digits are all wrong. The mod operation, however, seeks a result that requires every single digit to be correct, including the very last ones. Therefore, even the tiniest improvement to the precision of Math.Pow that may have been implemented in .NET 4, would result in a drastic difference of your calculation, which essentially produces an arbitrary result.

* Since this question talks about raising integers to high powers in the context of password hashing, it may be a very good idea to read this answerlink before deciding if your current approach should be changed for a potentially better one.

Community
  • 1
  • 1
Sergey Kalinichenko
  • 714,442
  • 84
  • 1,110
  • 1,523
  • 20
    Good answer. The real point is that this is a terrible hash function. OP needs to rethink the solution and use a more appropriate algorithm. – david.pfx Jan 31 '14 at 14:43
  • 1
    Isaac Newton: Is it possible that the moon is attracted to the earth in the same way that the apple is attracted to the earth? @david.pfx: The real point is that this is a terrible way to pick apples. Newton needs to rethink the solution and perhaps hire a man with a ladder. – jwg Feb 03 '14 at 09:16
  • 2
    @jwg David's comment got that many upvotes for a reason. The original question made it clear that the algorithm was being used to hash passwords, and it is indeed a terrible algorithm for that purpose - it is extremely likely to break between versions of the .NET framework, as has already been demonstrated. Any answer that doesn't mention that the OP needs to *replace* his algorithm rather than "fix" it is doing him a disservice. – Chris Feb 03 '14 at 15:50
  • @Chris Thanks for the comment, I edited to include David's suggestion. I didn't word it as strongly as you, because OP's system may be a toy or a throw-away piece of code that he builds for his own amusement. Thanks! – Sergey Kalinichenko Feb 03 '14 at 16:17
26

What you see is rounding error in double. Math.Pow works with double and the difference is as below:

.NET 2.0 and 3.5 => var powerResult = Math.Pow(ascii, e); returns:

1.2308248131348429E+174

.NET 4.0 and 4.5 => var powerResult = Math.Pow(ascii, e); returns:

1.2308248131348427E+174

Notice the last digit before E and that is causing the difference in the result. It's not the modulus operator (%).

Peter Mortensen
  • 30,738
  • 21
  • 105
  • 131
Habib
  • 219,104
  • 29
  • 407
  • 436
  • 3
    holy cow is this the ONLY answer to the OPs question? I read all the meta "blah blah security wrong question I know more than you n00b" and still wondered "why the consistent discrepancy between 3.5 and 4.0? Ever stubbed your toe on a rock while looking at the moon and asked "what kind of rock is this?" Only to be told "Your real problem is not looking at your feet" or "What do you expect when wearing home-made sandals at a night?!!!" THANKS! – Michael Paulukonis Feb 04 '14 at 20:51
  • 1
    @MichaelPaulukonis: That's a false analogy. Study of rocks is a legitimate pursuit; performing arbitrary-precision arithmetic using fixed-precision data types is just plain wrong. I'd compare this to a software recruiter inquiring why dogs are worse than cats at writing C#. If you're a zoologist, the question might hold some merit; for everyone else, it's pointless. – Douglas Feb 21 '14 at 13:23
23

Floating-point precision can vary from machine to machine, and even on the same machine.

However, the .NET make a virtual machine for your apps... but there are changes from version to version.

Therefore you shouldn't rely on it to produce consistent results. For encryption, use the classes that the Framework provides rather than rolling your own.

Community
  • 1
  • 1
Joe
  • 122,218
  • 32
  • 205
  • 338
10

There are a lot of answers about the way the code is bad. However, as to why the result is different…

Intel's FPUs use the 80-bit format internally to get more precision for intermediate results. So if a value is in the processor register it gets 80 bits, but when it is written to the stack it gets stored at 64 bits.

I expect that the newer version of .NET has a better optimizer in its Just in Time (JIT) compilation, so it is keeping a value in a register rather than writing it to the stack and then reading it back from the stack.

It may be that the JIT can now return a value in a register rather than on the stack. Or pass the value to the MOD function in a register.

See also Stack Overflow question What are the applications/benefits of an 80-bit extended precision data type?

Other processors, e.g. the ARM will give different results for this code.

Community
  • 1
  • 1
Ian Ringrose
  • 51,220
  • 55
  • 213
  • 317
5

Maybe it's best to calculate it yourself using only integer arithmetic. Something like:

int n = 143;
int e = 103;
int result = 1;
int ascii = (int) 'a';

for (i = 0; i < e; ++i) 
    result = result * ascii % n;

You can compare the performance with the performance of the BigInteger solution posted in the other answers.

Peter Mortensen
  • 30,738
  • 21
  • 105
  • 131
Ronald
  • 2,842
  • 16
  • 16
  • 7
    That would require 103 multiplications and modulus reductions. One can do better by computing e2=e*e % n, e4=e2*e2 % n, e8=e4*e4 % n, etc. and then result = e *e2 %n *e4 %n *e32 %n *e64 %n. A total of 11 multiplications and modulus reductions. Given the size of numbers involved, one could eliminate a few more modulus reductions, but that would be minor compared to reducing 103 operations to 11. – supercat Jan 31 '14 at 16:03
  • 2
    @supercat Nice mathematics, but in practice only relevant if you're running this on a toaster. – alextgordon Jan 31 '14 at 17:25
  • 7
    @alextgordon: Or if one is planning to use larger exponent values. Expanding the exponent value to e.g. 65521 would take about 28 multiplies and modulus reductions if one uses strength reduction, versus 65,520 if one doesn't. – supercat Jan 31 '14 at 17:32
  • +1 for giving an accessible solution where it's clear exactly how the calculation is done. – jwg Feb 03 '14 at 09:12
  • 2
    @Supercat: you're absolutely right. It's easy to improve the algorithm, which is relevant if either it is calculated very often or the exponents are large. But the main message is that it can and should be calculated using integer arithmetic. – Ronald Feb 05 '14 at 14:48