This question is loosely related to these two questions, with two differences: 1) I want to know how to use specific Intel instructions from the JVM (ideally via an existing library), and 2) I am not hashing one large file, but millions of short (< 50 characters) String and Number objects.
I noticed that Intel provides native extensions (https://software.intel.com/en-us/articles/intel-sha-extensions) for creating SHA256 hashes. Is there any existing library in Java that can hook these native extensions? Is there a JVM implementation that natively hooks these extensions?
Should I choose a different implementation when hashing millions of small String and Number values rather than a single giant file?
As a test, I tried five different SHA-256 implementations: Java built-in, Groovy built-in, Apache Commons, Guava, and Bouncy Castle. Only Apache Commons and Guava reached roughly 1 million hashes/sec or more on my Intel i5 hardware. In the output below, the first number on each line is the elapsed time in milliseconds.
>groovy hash_comp.groovy
Hashing 1000000 iterations of SHA-256
time java: 2968 336927.2237196765 hashes/sec
time groovy: 2451 407996.7360261118 hashes/sec
time apache: 1025 975609.7560975610 hashes/sec
time guava: 901 1109877.9134295228 hashes/sec
time bouncy: 1969 507872.0162519045 hashes/sec
>groovy hash_comp.groovy
Hashing 1000000 iterations of SHA-256
time java: 2688 372023.8095238095 hashes/sec
time groovy: 1948 513347.0225872690 hashes/sec
time apache: 867 1153402.5374855825 hashes/sec
time guava: 953 1049317.9433368311 hashes/sec
time bouncy: 1890 529100.5291005291 hashes/sec
When I ran 10 times in a row, Apache Commons hashing was the consistent winner when hashing 1 million strings (it won 9/10 times). My test code is available here.
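For readers without access to the linked test code, here is a minimal sketch of what the String-based path looks like with the Java built-in `MessageDigest`. It is not the benchmark code itself: the class name `Sha256Hex` and the helper `sha256Hex` are made up for illustration, and the UTF-8 encoding and hex output are assumptions about what such a test would do. It shows the extra work around the hash itself (encoding the String to bytes, formatting the digest as hex).

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

public class Sha256Hex {
    private static final char[] HEX = "0123456789abcdef".toCharArray();

    // Hypothetical helper: String -> UTF-8 bytes -> SHA-256 -> lowercase hex String.
    static String sha256Hex(String input) {
        try {
            MessageDigest md = MessageDigest.getInstance("SHA-256");
            byte[] digest = md.digest(input.getBytes(StandardCharsets.UTF_8));
            char[] out = new char[digest.length * 2];
            for (int i = 0; i < digest.length; i++) {
                int v = digest[i] & 0xFF;        // treat the byte as unsigned
                out[2 * i] = HEX[v >>> 4];       // high nibble
                out[2 * i + 1] = HEX[v & 0x0F];  // low nibble
            }
            return new String(out);
        } catch (NoSuchAlgorithmException e) {
            // SHA-256 is required on every Java platform, so this should not happen.
            throw new IllegalStateException("SHA-256 not available", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(sha256Hex("abc"));
    }
}
```

Both the `getBytes` call and the hex conversion allocate on every hash, so for inputs this short they can take a noticeable share of the measured time.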
The question remains: is there a way to tap into the Intel SHA hashing extensions from the JVM?
UPDATE
As @MJM suggested in the comments, I have removed the String functions and tested purely on byte[] to byte[]. Here are sample results:
Hashing 1000000 iterations of SHA-256
time java: 674 1483679.5252225519 hashes/sec
time apache: 833 1200480.1920768307 hashes/sec
time guava: 705 1418439.7163120567 hashes/sec
time bouncy: 692 1445086.7052023121 hashes/sec
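A byte[]-to-byte[] measurement of the built-in implementation could be sketched roughly as follows. This is only an illustration under assumptions, not the code that produced the numbers above: the class `Sha256Bench`, the helpers `makeInputs` and `hashesPerSecond`, and the input format `"value-" + i` are all hypothetical. The inputs are converted to bytes before the timer starts, so only `MessageDigest.digest` is inside the timed loop.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

public class Sha256Bench {
    // Hypothetical input generator: n short values, already encoded as bytes.
    static byte[][] makeInputs(int n) {
        byte[][] inputs = new byte[n][];
        for (int i = 0; i < n; i++) {
            inputs[i] = ("value-" + i).getBytes(StandardCharsets.UTF_8);
        }
        return inputs;
    }

    // Times only the byte[] -> byte[] hashing and returns hashes per second.
    static double hashesPerSecond(byte[][] inputs) {
        MessageDigest md;
        try {
            md = MessageDigest.getInstance("SHA-256");
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("SHA-256 not available", e);
        }
        int sink = 0; // consume results so the loop is not optimized away
        long start = System.nanoTime();
        for (byte[] in : inputs) {
            byte[] out = md.digest(in); // digest() also resets md for reuse
            sink ^= out[0];
        }
        long elapsedNanos = Math.max(1L, System.nanoTime() - start);
        if (sink == Integer.MIN_VALUE) {
            System.out.print(""); // never true for a byte-sized value; keeps sink live
        }
        return inputs.length / (elapsedNanos / 1e9);
    }

    public static void main(String[] args) {
        byte[][] inputs = makeInputs(1_000_000);
        hashesPerSecond(inputs); // warm-up pass so the JIT has compiled the hot loop
        System.out.printf("time java: %.1f hashes/sec%n", hashesPerSecond(inputs));
    }
}
```

A single timed pass like this is still sensitive to JIT warm-up and garbage collection, so results from a harness such as JMH would be more trustworthy than one-off timings.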