3

I am implementation a distributed key-value store in Java. I need to save a timestamp for each key. Since I want to have a large of number of keys in the system, I decided to use BigInteger instead of long, but I am concerned about its efficiency.

Note that I don't have any multiplication, I only used addition and comparTo. So do you think the BigInteger is significantly less efficient than long?

It is the first time that I am trying BigInteger, is there any other concern comparing to long?

Yar
  • 7,020
  • 11
  • 49
  • 69
Mohammad Roohitavaf
  • 439
  • 2
  • 6
  • 15
  • 3
    What sort of efficiency? Speed or memory or both? Have you tested BigInteger vs. long in your actual application? – Nayuki Jul 18 '16 at 01:43
  • 4
    How many keys do you expect that you think `2^64` isn't enough? – apangin Jul 18 '16 at 01:52
  • 1
    http://stackoverflow.com/questions/31748028/long-vs-biginteger | http://stackoverflow.com/questions/12498363/performance-implications-of-using-java-biginteger-for-a-huge-bitmask | http://stackoverflow.com/questions/23603513/increasing-javas-biginteger-performance – Yar Jul 18 '16 at 01:53
  • 4
    You're not going to need more that 2^64 keys in your system. I guarantee it. – ddyer Jul 18 '16 at 01:59
  • 2
    You're using the timestamp for the key? At what precision, and what happens if multiple keys come in at the same time? Anyway, a long can fit about 585 years' worth of nanoseconds, so it's going to be more than enough. (Btw, that yields my favorite bit of JavaDoc in the JDK: a warning that calculating difference in [System.nanoTime() will overflow if your app runs for more than 292 years](https://docs.oracle.com/javase/8/docs/api/java/lang/System.html#nanoTime--).) – yshavit Jul 18 '16 at 02:08
  • Have you considered using, err, a `Timestamp`? – user207421 Jul 18 '16 at 02:15
  • 1
    Or a UUID, possibly time-based? – chrylis -cautiouslyoptimistic- Jul 18 '16 at 02:25
  • 1
    Thanks for suggestions. Do we have unsigned long in java? By a long I can have 2^63-1. Note that my timestamps are lamport clocks (logical clocks), so they are basically simple counters. – Mohammad Roohitavaf Jul 18 '16 at 02:40
  • 1
    @MohammadRoohitavaf Even that gives you ~292 years' worth of nanoseconds (see my fun JavaDoc link from above). But you can always just treat the `long` as 64 bits as representing an unsigned number. Things like toStringing and inequality comparisons won't work out of the box, but Guava has a couple classes to help with that: [`UnsignedLong`](http://google.github.io/guava/releases/snapshot/api/docs/com/google/common/primitives/UnsignedLong.html) gives you a "boxed"-style instance, whereas `UnsignedLongs` (note the plural) provides static method calls that work on longs. – yshavit Jul 18 '16 at 02:45
  • 1
    Mmmm..., so if the users of my system, add 1000,000,000 keys per second, the system works fine for 292 years. Good ^_^ – Mohammad Roohitavaf Jul 18 '16 at 03:06
  • @MohammadRoohitavaf If you allow numbers to go negative, they won't duplicate/overlap for another 292 years. – Peter Lawrey Jul 18 '16 at 06:52
  • @MohammadRoohitavaf You could limit yourself to 10 million keys per second in which case you get 29,200 years. – Peter Lawrey Jul 18 '16 at 06:53

2 Answers2

5

No. BigInteger needs more memory than a long, and because it is not a primitive type, it is also slower. I would only use it when you need more digits than a long can provide.

For your purposes, a long should be sufficient (and more efficient), as far as I can tell.

Rudy Velthuis
  • 28,387
  • 5
  • 46
  • 94
4

Instant

If you want a timestamp, we already have a class for that: Instant represents a moment on the timeline in UTC with a resolution up to nanoseconds.

Instant instant = Instant.now() ;

In Java 8 the current moment is captured with a resolution up to milliseconds. In Java 9 a new implementation of Clock captures the current moment up to the full nanosecond resolution, depending on your computer clock hardware.

UUID

But if you want to identify objects across distributed systems, use the type invented for just that purpose: Universally Unique Identifier (UUID). This type is defined in official standards. Made of a 128-bit value, basically an unimaginably large number, but certain bits have certain meanings.

For human reading, a hex string is generated in a canonical format.

Java includes a UUID class to represent such values. Stored internally as a pair of 64-bit long numbers in the OpenJDK implementation as I recall.

UUID uuid = UUID.randomUUID() ;
String hex = uuid.toString() ;

BigInteger > long

And to answer your direct issue, BigInteger is designed to represent ginormous numbers, not designed for efficiency. A long primitive (64-bit number) uses much less memory since it lacks the overhead of a class & object and the machinery for representing and operating on ginormous numbers that may be much larger than fit into CPU registers. And operations on a long execute much faster.

So, when you do not need the features of BigInteger, stick with long primitive.

Basil Bourque
  • 303,325
  • 100
  • 852
  • 1,154