Unsigned Integer Implementation - Porting C# to Java

Question

I'm porting several thousand lines of cryptographic C# functions to a Java project. The C# code extensively uses unsigned values and bitwise operations.

I am aware of the necessary Java work-arounds to support unsigned values. However, it would be much more convenient if there were implementations of unsigned 32bit and 64bit Integers that I could drop into my code. Please link to such a library.

Quick google queries reveal several that are part of commercial applications:

By the way, I forgot to ask: What feature exactly are you looking for in these libraries that can't be done with something similar to what I wrote below? — user541686, Apr 10 '11 at 03:19
@Mehrdad My misconceptions about Java may also have distorted what I assumed possible. I'll probably accept Thomas Pornin's answer for consolidating the common techniques - even though they are exactly what I was trying to avoid. — Kyle Ivey, Apr 12 '11 at 07:46
There honestly isn't much "expertise" involved in bitwise arithmetic; you can really do it yourself. Clever in what way, though? If you mean speed, that's one thing, but other than that, I think the issue is correctness, not cleverness. (And also, obviously accept Thomas's answer if you found that more helpful! :] ) — user541686, Apr 12 '11 at 07:46

user541686 · Answer 1 · 2011-04-10T02:05:15.137

2

This is a language feature, not a library feature, so there is no way to extend Java to support this functionality unless you change the language itself, in which case you'd need to make your own compiler.

However, if you need unsigned right-shifts, Java supports the >>> operator which works like the >> operator for unsigned types.

You can, however, make your own methods to perform arithmetic with signed types as though they were unsigned; this should work, for example:

static int multiplyUnsigned(int a, int b)
{
    final bool highBitA = a < 0,    highBitB = b < 0;
    final long a2 = a & ~(1 << 31), b2 = b & ~(1 << 31);
    final long result = (highBitA ? a2 | (1 << 31) : a2)
                      * (highBitB ? b2 | (1 << 31) : b2);
    return (int)result;
}

Edit:

Thanks to @Ben's comment, we can simplify this:

static int multiplyUnsigned(int a, int b)
{
    final long mask = (1L << 32) - 1;
    return (int)((a & mask) * (b & mask));
}

Neither of these methods works, though, for the long type. You'd have to cast to a double, negate, multiply, and cast it back again in that case, which would likely kill any and all of your optimizations.

edited Apr 10 '11 at 02:05

answered Apr 10 '11 at 01:46

user541686

205,094
128
528
886

Thank you for your response, but I'm well aware of the `>>>` operator. I really would expect that it would be possible to implement and unsigned integer object within Java, including the necessary operator overloads. The docs for commercial projects such as [this](http://www.teamdev.com/downloads/jniwrapper/javadoc/com/jniwrapper/UInt64.html) give me hope. – Kyle Ivey Apr 10 '11 at 01:49
I don't think that multiply function returns anything resembling correct results when the high bit is set. – Ben Voigt Apr 10 '11 at 01:55
@Karl: Java does not support operator overloading. And for cryptographic purposes, *objects* would be very slow; you'd need primitive unsigned types, which do not exist in Java. – user541686 Apr 10 '11 at 01:55
@Ben: Yeah you were right; I edited it, this one should work. – user541686 Apr 10 '11 at 01:56
@Mehrdad: If you have a wider integral type, you can do simply `long mask = (1L << 32) - 1; return (a & mask) * (b & mask);` But that doesn't generalize to the widest integral type. – Ben Voigt Apr 10 '11 at 02:05
@Ben: Oooh that's clever! Yeah that seems like it would work, nice! :) – user541686 Apr 10 '11 at 02:06
@Mehrdad Computation speed is not so much of an issue here if an object-based approach would be considerably more convenient. Today I learned that Java doesn't support operator overloading - yikes. But as the commercial implementations show, the `Number` class handles this. All I'm really looking for is a freely available wrapper that does the same. – Kyle Ivey Apr 10 '11 at 02:10
@Mehrdad, that multiply is unnecessary. Java integral types are specified to be 2's complement. Your `multiply` is correct only if it produces the same result as `a * b`. – rlibby Apr 10 '11 at 02:11
@Karl: Ah, in that case, someone else might know of a library to help, if you won't be disappointed by the speed. – user541686 Apr 10 '11 at 02:11
@Mehrdad, then look up how to implement multiplication of unsigned numbers and how to implement multiplication of two's complement numbers. You will see they are the same. There is no need for a special method to multiply two two's complement numbers as if they were unsigned number--this is what happens anyway. Same goes for addition. – rlibby Apr 10 '11 at 02:15
@rlibby: I believe you're mistaken. Multiplication is different; addition is the same. That's why x86 has an `imul` instruction as well as a `mul` instruction, but it only has a single `add` instruction for both signed and unsigned. – user541686 Apr 10 '11 at 02:16
@Mehrdad, wrong. That's not the reason for the existence of both `mul` and `imul`. See [here](http://stackoverflow.com/questions/4039378/x86-mul-instruction-from-vs-2008-2010) and note that gcc generates the same code for `x*y` as `(unsigned int)x*(unsigned int)y` (it uses `imull`). – rlibby Apr 10 '11 at 02:58
@Mehrdad, your point is wrong. If you still believe it is correct, please provide a counterexample. If you're interested in seeing why it's wrong, follow my first suggestion and see how multiplication is actually implemented in two's complement, i.e. the same as unsigned multiplication. – rlibby Apr 10 '11 at 03:06
@rlibby: Maybe I'm mistaken, but what do you get when you do `110 * 011` in binary, assuming you're working with three-bit integers (and so that your result will be truncated)? You get `010 010` which is truncated to `010`, right? If multiplication worked the same way for positive and negative numbers, that would mean you'd get `-2 * 3 == 2`, which is clearly incorrect... – user541686 Apr 10 '11 at 03:20
@Mehrdad, actually, it is correct. 110 * 011 = 010 after truncating. If you were thinking of it as signed, this would be -2 * 3 = -6. -6 in 6 bits would be written 111010, truncated to 010, yes, 2. If you were thinking of it as unsigned, this would be 6 * 3 = 18. 18 in 6 bits would be 010010, again truncated to 010. – rlibby Apr 10 '11 at 03:35
@rlibby: Yeah, you're right; I stand corrected. :-) Thanks for taking the time to point it out! – user541686 Apr 10 '11 at 03:46

score 2 · Accepted Answer · answered Apr 11 '11 at 14:16

Operations with signed and unsigned integers are mostly identical, when using two's complement notation, which is what Java does. What this means is that if you have two 32-bit words a and b and want to compute their sum a+b, the same internal operation will produce the right answer regardless of whether you consider the words as being signed or unsigned. This will work properly for additions, subtractions, and multiplications.

The operations which must be sign-aware include:

Right shifts: a signed right shift duplicates the sign bit, while an unsigned right shift always inserts zeros. Java provides the ">>>" operator for unsigned right-shifting.
Divisions: an unsigned division is distinct from a signed division. When using 32-bit integers, you can convert the values to the 64-bit long type ("x & 0xFFFFFFFFL" does the "unsigned conversion" trick).
Comparisons: if you want to compare a with b as two 32-bit unsigned words, then you have two standard idioms:

if ((a + Integer.MIN_VALUE) < (b + Integer.MIN_VALUE)) { ... }

if ((a & 0xFFFFFFFFL) < (b & 0xFFFFFFFFL)) { ... }

Knowing that, the signed Java types are not a big hassle for cryptographic code. I have implemented many cryptographic primitives in Java, and the signed types are not an issue provided that you understand what you are writing. For instance, have a look at sphlib: this is an opensource library which implements many cryptographic hash functions, both in C and in Java. The Java code uses Java's signed types (int, long...) quite seamlessly, and it simply works.

Java does not have operator overloading, so Java-only "solutions" to get unsigned types will involve custom classes (such as the UInt64 class you link to), which will imply a massive performance penalty. You really do not want to do that.

Theoretically, one could define a Java-like language with unsigned types and implement a compiler which produces bytecode for the JVM (internally using the tricks I detail above for shifts, divisions and comparisons). I am not aware of any available tool which does that; and, as I said above, Java's signed types are just fine for cryptographic code (in other words, if you have trouble with such signed types, then I daresay that you do not know enough to implement cryptographic code securely, and you should refrain from doing so; instead, use existing opensource libraries).

Unsigned Integer Implementation - Porting C# to Java

2 Answers2