18

I'm working on an encryption feature based on classes that inherit from SymmetricAlgorithm, such as TripleDES, DES, etc.

There are basically two options for generating a consistent key and IV for my algorithm class: PasswordDeriveBytes and Rfc2898DeriveBytes. Both inherit from the DeriveBytes abstract class.

The PasswordDeriveBytes.GetBytes() method is marked as obsolete in the .NET Framework, while Rfc2898DeriveBytes.GetBytes() is recommended because it implements the PBKDF2 standard. However, based on my testing, calling GetBytes() on Rfc2898DeriveBytes is almost 15 times slower than calling it on PasswordDeriveBytes, which leads to unexpectedly high CPU usage (always above 50%).

Here is some test data:

  • Iterations: 100
  • Algorithm type: DES
  • Original Text: "I'm a test key, encrypt me please"
  • Time:
    • PasswordDeriveBytes: 99ms
    • Rfc2898DeriveBytes: 1,373ms

Based on this testing, the poor performance of Rfc2898DeriveBytes is not acceptable in a production environment.

Has anyone noticed this problem before? Is there a way to keep using a standard method without the performance hit? And is there any risk in using an obsolete method (which could be removed in a future version)?

Thanks guys!

Edit:

I think I found where the problem is. The default iteration count for PasswordDeriveBytes is 100, while for Rfc2898DeriveBytes it is 1000. After I set both to 1000, Rfc2898DeriveBytes takes only about twice as long.
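A minimal sketch of a like-for-like comparison, with the iteration count passed explicitly to both classes instead of relying on the defaults. The helper name `TimeDerivations`, the salt value, and the run count are illustrative assumptions, not the original test code. It assumes .NET 4.0 or later (where `DeriveBytes` is disposable); recent compilers may emit obsolescence warnings for both classes. A stopwatch loop like this is only a rough comparison, not a rigorous benchmark.

```csharp
using System;
using System.Diagnostics;
using System.Security.Cryptography;
using System.Text;

static class DeriveBytesComparison
{
    // Hypothetical helper: runs `runs` key derivations and returns the
    // total elapsed time in milliseconds.
    public static long TimeDerivations(Func<DeriveBytes> factory, int keyBytes, int runs)
    {
        var sw = Stopwatch.StartNew();
        for (int i = 0; i < runs; i++)
        {
            using (DeriveBytes kdf = factory())
            {
                kdf.GetBytes(keyBytes);
            }
        }
        sw.Stop();
        return sw.ElapsedMilliseconds;
    }

    static void Main()
    {
        const string password = "I'm a test key, encrypt me please";
        // Illustrative fixed salt; Rfc2898DeriveBytes requires at least 8 bytes.
        byte[] salt = Encoding.ASCII.GetBytes("8-byte+ salt");
        const int iterations = 1000; // same count for both classes
        const int runs = 100;
        const int desKeyBytes = 8;   // DES uses a 64-bit key

        long pbkdf1Ms = TimeDerivations(
            () => new PasswordDeriveBytes(password, salt, "SHA1", iterations),
            desKeyBytes, runs);
        long pbkdf2Ms = TimeDerivations(
            () => new Rfc2898DeriveBytes(password, salt, iterations),
            desKeyBytes, runs);

        Console.WriteLine("PasswordDeriveBytes: {0} ms", pbkdf1Ms);
        Console.WriteLine("Rfc2898DeriveBytes:  {0} ms", pbkdf2Ms);
    }
}
```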

JYelton
  • 35,664
  • 27
  • 132
  • 191
tshao
  • 1,127
  • 2
  • 8
  • 23
  • How often are you going to be deriving keys in a production environment? And, regarding your timing data, when you said "100 iterations": is that iterations on the one key, or did you generate 100 keys? Any perf data based on 100 trials is suspect, but I think you actually tested ONE trial. As in all perf analysis cases, it is simply not appropriate to draw conclusions about server performance based on the response time of a single trial. – Cheeso Sep 01 '09 at 02:05
  • @Cheeso The test was just a unit test comparing the performance of these two classes; it was not done in a real app. The "100 iterations" I mentioned was a bit confusing: it only means I executed each of them 100 times. It's not real perf testing, just a comparison. – tshao Sep 01 '09 at 03:41
  • 5
    I think you may have missed the point: `Rfc2898DeriveBytes` is fundamentally _designed_ to be slow, so that password hash checks (done per log-on and hence fairly infrequently) don't notice the performance hit while brute force attacks do. If you need to generate loads of hashes, `Rfc2898DeriveBytes` isn't for you, but if you need some security from brute force attacks, it is. – Keith Aug 06 '12 at 13:17
  • While it is agreed that `Rfc2898DeriveBytes` is slower by design, I'm having a hard time trusting the reliability of your performance test. Simply iterating over a chunk of code a hundred times is most certainly not the proper way to go about conducting a reliable microbenchmark. – arkon Apr 01 '15 at 05:35

3 Answers

28

They aren't the same thing.

Rfc2898DeriveBytes is an implementation of PBKDF2. PasswordDeriveBytes is an implementation of PBKDF1. PBKDF2 generates a different output, using a different method, and a much larger number of rounds than PBKDF1.

Password hashing functions such as these, which are used for key derivation, are supposed to be slow. That's the point: it makes them much more difficult to crack.

The two functions are not compatible, and PasswordDeriveBytes is not nearly as secure.

Cheeso
  • 189,189
  • 101
  • 473
  • 713
BlackAura
  • 3,208
  • 1
  • 18
  • 7
  • Thanks BlackAura. I can understand that the PBKDF2 implementation ought to be slow, but isn't there any best practice for using the Rfc2898DeriveBytes class, such as how to cache/reuse the same key/IV? Repeatedly running a method that slow in a production environment is not acceptable. :P – tshao Sep 01 '09 at 01:44
  • 4
    Generally, you'd only want to use these functions to generate a key from a password. Usually for something like a password-protected archive, or similar. To encrypt an archive, you'd generate a random IV, and generate a key from the password and the IV. You store the IV (but never the key). You can then re-use the key for every file in the archive, as long as each file is encrypted with a different IV (also stored in the archive). The only other use for these functions is password hashing. You'd do this once, when a user logs in. For any other use, there's probably a better way. – BlackAura Sep 01 '09 at 16:08
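A rough sketch of the archive pattern described in the comment above: pay for the slow derivation once, cache the key in memory, and give each file its own random IV. The class and method names are hypothetical, AES is an assumed cipher choice (the question mentions DES and TripleDES), and the random value fed into the derivation is called a salt here. It assumes .NET 4.0 or later; recent compilers may warn that this `Rfc2898DeriveBytes` constructor is outdated.

```csharp
using System;
using System.Security.Cryptography;
using System.Text;

static class ArchiveCryptoSketch
{
    // Derive the key once per archive; this is the deliberately slow step.
    public static byte[] DeriveKey(string password, byte[] salt, int iterations)
    {
        using (var kdf = new Rfc2898DeriveBytes(password, salt, iterations))
        {
            return kdf.GetBytes(32); // 256-bit AES key (assumed cipher)
        }
    }

    // Encrypt one file's bytes with the cached key and a fresh random IV.
    // The IV is prepended to the ciphertext so it can be stored with it.
    public static byte[] EncryptWithFreshIv(byte[] key, byte[] plaintext)
    {
        using (Aes aes = Aes.Create())
        {
            aes.Key = key;
            aes.GenerateIV();
            byte[] iv = aes.IV;
            using (ICryptoTransform enc = aes.CreateEncryptor())
            {
                byte[] cipher = enc.TransformFinalBlock(plaintext, 0, plaintext.Length);
                byte[] result = new byte[iv.Length + cipher.Length];
                Buffer.BlockCopy(iv, 0, result, 0, iv.Length);
                Buffer.BlockCopy(cipher, 0, result, iv.Length, cipher.Length);
                return result;
            }
        }
    }

    static void Main()
    {
        // Random salt, stored in the archive header; the key itself is never stored.
        byte[] salt = new byte[16];
        using (var rng = RandomNumberGenerator.Create())
        {
            rng.GetBytes(salt);
        }

        byte[] key = DeriveKey("archive password", salt, 1000); // slow, done once

        foreach (string name in new[] { "a.txt", "b.txt" })
        {
            byte[] data = Encoding.UTF8.GetBytes("contents of " + name);
            byte[] blob = EncryptWithFreshIv(key, data); // fast, done per file
            Console.WriteLine("{0}: {1} bytes (IV + ciphertext)", name, blob.Length);
        }
    }
}
```

With this split, the cost of `Rfc2898DeriveBytes` is paid once per password entry rather than once per encryption call.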
11

I think you are missing the point of DeriveBytes. It is supposed to be slow. It intentionally uses a slow algorithm that cannot be sped up by a clever trick. The typical "number of iterations" parameter should be in the 2^16 to 2^20 range and introduce a 0.1 to 0.5 second delay between the user entering the password and the key being generated. The intention is to defend against weak passwords selected by "lazy ignorant users" and to slow down brute force searches.
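One way to pick an iteration count that produces a delay in that range is to measure it on the target machine. The sketch below is an assumption about how that could be done, not an established API: the helper name `FindIterations` and its doubling strategy are hypothetical. It assumes .NET 4.0 or later, and recent compilers may warn that this constructor is outdated.

```csharp
using System;
using System.Diagnostics;
using System.Security.Cryptography;

static class IterationCalibration
{
    // Hypothetical helper: doubles the iteration count, starting at `start`,
    // until a single derivation takes at least targetMs on this machine or
    // the count reaches `max`. Returns the chosen count.
    public static int FindIterations(int targetMs, int start, int max)
    {
        byte[] salt = new byte[16]; // all-zero salt, used for timing only
        int iterations = start;
        while (iterations < max)
        {
            var sw = Stopwatch.StartNew();
            using (var kdf = new Rfc2898DeriveBytes("calibration password", salt, iterations))
            {
                kdf.GetBytes(20); // one HMAC-SHA1 output block
            }
            sw.Stop();
            if (sw.ElapsedMilliseconds >= targetMs)
            {
                break;
            }
            iterations *= 2;
        }
        return Math.Min(iterations, max);
    }

    static void Main()
    {
        // Aim for roughly 100 ms per derivation, within 2^16 to 2^20 iterations.
        int iterations = FindIterations(100, 1 << 16, 1 << 20);
        Console.WriteLine("Chosen iteration count: {0}", iterations);
    }
}
```

The chosen count would need to be stored alongside the salt so that the same key can be derived again later.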

nsg
  • 111
  • 1
  • 2
10

This blog post talks about the differences between the two: http://blogs.msdn.com/shawnfa/archive/2004/04/14/generating-a-key-from-a-password.aspx

Yannick Motton
  • 34,761
  • 4
  • 39
  • 55