21

We have three web applications (standard Spring MVC-Hibernate) running within a Jboss server 6.1. All three applications share a common authentication method which is compiled as a JAR and included within each WAR file. Our authentication method uses org.springframework.security.crypto.bcrypt.BCrypt to hash user passwords, please see below:

hashedPassword.equals(BCrypt.hashpw(plainTextPassword, salt));

JBOSS StartUp Options

set "JAVA_OPTS=-Xms2048m -Xmx4096m -XX:PermSize=256m -XX:MaxPermSize=512m -XX:+HeapDumpOnOutOfMemoryError -verbosegc -XX:+PrintGCTimeStamps -XX:+PrintGCDetails -Xloggc:gc.txt -XX:+UseParallelOldGC

Problem: It appears that when the server is restarted, the Bcrypt.hashpw takes 100ms to decrypt password. However after some time (there is no pattern) suddenly the Bcrypt.hashpw performance spikes up from 100ms to 10s of seconds. There is no obvious reason for this.

More information:

  • Hibernate Version: 4.2.4.Final
  • Spring Version: 4.0.5.RELEASE Spring
  • Security Version: 3.2.4.RELEASE

Has anyone else seen this problem before?

George Artemiou
  • 3,076
  • 2
  • 19
  • 36
  • 1
    Attach a profiler and figure out which part is taking so long. – M. Deinum Apr 11 '16 at 09:34
  • Hello @m-deinum, we have already done this and found out that the issue is with the library itself. That does not explain why the library works fine for some time and then it spikes up (and stays like that) to 10 seconds or more. – George Artemiou Apr 11 '16 at 09:45
  • Which library and which part of the library... The fact that it is *the library* doesn't clarify things, you might want to be a bit more specific as to which part of the library. Method etc. – M. Deinum Apr 11 '16 at 10:12
  • Hello @m-deinum, as I said on my question's description, we have identified an issue with method hashpw of library BCrypt which is part of Spring Security. – George Artemiou Apr 11 '16 at 12:18
  • That method uses a lot of other methods, what I requested is use a profiler to see what part of that method is taking that time. That is the interesting part and the path to the solution. So a bit more info, detail on which part of the method or which specific call inside that method is taking time is worthwhile. – M. Deinum Apr 11 '16 at 12:20
  • The issue we are facing is that we don't know the steps to reproduce this issue as it's completely random. When we restart the server, the performance of that method goes back to normal but after some time, it creeps back so even investigating this issue using a profiler is very hard. – George Artemiou Apr 12 '16 at 09:02
  • Is BCrypt.hashpw the only method experiencing the slowness? It looks like that the all the operations in hashpw functions are pure arithmetic operation except some memory allocations(object cloning), it will not use the SecureRandom class or anything like that. I would suspect that there some other code somewhere else occupying too much CPU or memory. – qianfg Apr 18 '16 at 08:58
  • @GeorgeArtemiou Have you been able to find out the cause of the problem? I've faced with the same problem: call of the method `BCrypt.checkpw` might take from 100ms (that is OK) to >10 seconds! I found that this problem occurs in multithread environment and than the method `BCrypt#key` takes the most of part of execution time. `syncrhonized` solves the problem but it's not an option for our app. – Roman Proshin Feb 16 '17 at 09:52

4 Answers4

18

Problem: It appears that when the server is restarted, the Bcrypt.hashpw takes 100ms to decrypt password. However after some time (there is no pattern) suddenly the Bcrypt.hashpw performance spikes up from 100ms to 10s of seconds. There is no obvious reason for this.

The problem is /dev/random sometimes blocks and when it does it will appear to be random :) The more confusing thing is that while trying to test how it works you'll run up against the Observer Effect ie while trying to observe random behavior you're generating entropy and this can lead to a ton of confusion i.e. my results won't be the same as yours etc. This is also why it looks like there's no pattern..

I'll demonstrate the problem and show you how to recreate it (within reason) on your own servers so you can test solutions. I'll try and provide a couple of fixes, note this is on Linux but the same problem will happen on any system that requires entropy to generate random numbers and runs out.

On Linux /dev/random is a stream of random bytes. As you read from the stream you deplete the available entropy. When it reaches a certain point reads from /dev/random block. You can see available entropy using this command

cat /proc/sys/kernel/random/entropy_avail

If you run the following bash script and also monitor entropy_avail you'll notice that entropy dips dramatically as the bash script consumes it.

while :
do
  cat /dev/random > /dev/null
done

This should also give you a hint on how to recreate this problem on your servers ie run the above bash script to reduce available entropy and the problem will manifest itself.

If you want to see just how many bytes per second your system is creating you can use pv to measure it i.e.

pv /dev/random

If you leave pv running it has an effect, it's consuming the random stream of bytes which means other services might start to block. Note that pv is also displaying it's output so it might also be increasing available entroy on the system :).

On systems with little or no entropy using pv /dev/random will seem glacially slow. I've also experienced that VM's sometimes have major issues with generating entropy.

To recreate the issue use the following class...

import java.security.SecureRandom;
import org.mindrot.jbcrypt.BCrypt;
public class RandTest {
    public static void main(String[] args) {
        SecureRandom sr = new SecureRandom();
        int out = 0;
        String password = "very-strong-password-1729";
        String hashed;
        for (int i = 0; i < 200000 ; i++) {
            hashed = BCrypt.hashpw(password, BCrypt.gensalt());
            //If we print, we're generating entroy :) System.out.println(hashed);
        }
    }
}

I downloaded bcrypt to a local directory. I compiled and ran it as follows

javac -cp ./jBCrypt-0.4/src/   RandTest.java
java  -cp ./jBCrypt-0.4/src/:. RandTest

If you then run the bash script from earlier while runnng RandTest you'll see large pauses where the system is blocking waiting for more entropy. If you run strace you'll see the following...

1067 [pid 22481] open("/dev/random", O_RDONLY|O_LARGEFILE) = 12
11068 [pid 22481] fstat64(12, {st_mode=S_IFCHR|0666, st_rdev=makedev(1, 8), ...}) = 0
11069 [pid 22481] fcntl64(12, F_GETFD)        = 0
11070 [pid 22481] fcntl64(12, F_SETFD, FD_CLOEXEC) = 0
.....
11510 [pid 22481] read(12, "\320\244\317RB\370", 8) = 6

The program is reading from /dev/random. The problem with testing entropy is you might be generating more of it while trying to test it ie the Observer Effect.

Fixes

The first fix is to change from using /dev/random to /dev/urandom ie

time java  -Djava.security.egd=file:///dev/./urandom -cp ./jBCrypt-0.4/src/:.  RandTest

An alternative fix is to recreate the /dev/random device as a /dev/urandom device. You can find how to do this form the man page ie, instead of creating them...

mknod -m 644 /dev/random c 1 8
mknod -m 644 /dev/urandom c 1 9
chown root:root /dev/random /dev/urandom

we delete one and fake it ie

rm /dev/random
mknod -m 644 /dev/random c 1 9
chown root:root /dev/random

/dev/random is now actually /dev/urandom

The key thing to remember is testing random data that requires entroy from the system you're testing on is difficult because of the Observer Effect.

Harry
  • 11,298
  • 1
  • 29
  • 43
  • 3
    Thanks for your reponse. But our scenario is different. In short, we are not using Bcrypt.gensalt(). We do not generate salt while authenticating user. Instead we already have salt stored in db and retrieve it before using hashpw method. – George Artemiou Apr 25 '16 at 14:26
  • Bcrypt.gensalt is consuming entropy. It's getting the salt from the random stream that SecureRandom uses. – Harry May 28 '18 at 07:08
4

One possible explanation is that the SeedGenerator of SecureRandom is causing the delays.

Springs BCrypt implementation uses SecureRandom which in turn uses a SeedGenerator which in turn may use the blocking /dev/random. Here is a good description of those classes.

That bugreport also reports performance problems in BCrypt and traced them back to the seed generator, showing full stacktraces. The BCrypt implementation is different but the stacktrace below SecureRandom must be identical to the spring implementation. Their solution was to reduce the reseed frequency of BCrypt.

Community
  • 1
  • 1
wero
  • 32,544
  • 3
  • 59
  • 84
  • Thanks for your reponse. But our scenario is different. In short, we are not using Bcrypt.gensalt(). We do not generate salt while authenticating user. Instead we already have salt stored in db and retrieve it before using hashpw method. – George Artemiou Apr 25 '16 at 14:26
0

changing to urandom tag will work only on JDK8 or above, we were facing this for long time and changing to urandom in 1.7 didnt help but in 1.8 it did solve the issue.

0

For anyone stuck with the same issue, we resolved this by installing rng-tools

parser_failed
  • 697
  • 9
  • 20