I wonder why java.util.Hashtable methods operate on the elements array through a locally assigned variable instead of accessing the class member directly. Does it have something to do with synchronization (keeping the elements array in a consistent state between method calls)?

For example, `Entry<?,?> tab[] = table;`

in

 private void addEntry(int hash, K key, V value, int index) {
420         modCount++;
421 
422         Entry<?,?> tab[] = table;
423         if (count >= threshold) {
424             // Rehash the table if the threshold is exceeded
425             rehash();
426 
427             tab = table;
428             hash = key.hashCode();
429             index = (hash & 0x7FFFFFFF) % tab.length;
430         } 

or in

456     public synchronized V put(K key, V value) {
457         // Make sure the value is not null
458         if (value == null) {
459             throw new NullPointerException();
460         }
461 
462         // Makes sure the key is not already in the hashtable.
463         Entry<?,?> tab[] = table;
464         int hash = key.hashCode();
465         int index = (hash & 0x7FFFFFFF) % tab.length;
466         @SuppressWarnings("unchecked")
467         Entry<K,V> entry = (Entry<K,V>)tab[index];
468         for(; entry != null ; entry = entry.next) {
469             if ((entry.hash == hash) && entry.key.equals(key)) {
470                 V old = entry.value;
471                 entry.value = value;
472                 return old;
473             }
474         }
475 
476         addEntry(hash, key, value, index);
477         return null;
478     }
joe.d
  • `rehash()` is `protected` but not `synchronized`, so an instance of a child class could invoke it concurrently with another of those methods; the local variable avoids a race condition on `table`. (The doc says `rehash()` reorganizes the internal table and also increases capacity, so that would be a reason to allow subclasses to call it.) – Hugues M. Aug 13 '17 at 22:00
  • I am sorry, I did not quite get what the connection is with local variable reassignment in this case. – joe.d Aug 13 '17 at 22:17
  • @joe.d the idea is that in some subclass, `rehash`, which changes what the ivar `table` points to, might be called while some of the code you're looking at is running. The assignment to a local var keeps the `table` reference from suddenly changing under you. – pvg Aug 13 '17 at 22:25
  • @pvg what about `get` method, it does not call `rehash()` – joe.d Aug 13 '17 at 22:27
  • Code executes `get()` from thread A, other code in thread B executes `rehash()` at the same time. That other code would be a child class because that's the only case (that I saw) where `table` can be touched concurrently to other methods (because all others accesses are done via a `synchronized` method). This was already like that in Java 1.1 (just verified, but could not find SCM comments as I was hoping to get hints about the intention) – Hugues M. Aug 13 '17 at 22:44
  • @joe.d it doesn't but there is nothing stopping some subclass from exposing concurrent access to `rehash`. This is to avoid a problem that a potential subclass might introduce. – pvg Aug 13 '17 at 22:44
  • @HuguesM. heh, I just went looking at 1.1 too and I'm pretty sure it didn't change much from 1.0. A somewhat broader answer is that this is basically an early 90s design that has just had a long life and it shows. It's a class that tries to be both threadsafe *and* extensible through subclassing at the same time, something the class library avoids in later design iterations. – pvg Aug 13 '17 at 22:48
  • Thank you all... – joe.d Aug 13 '17 at 22:53
  • @HuguesM. If you don't mind, I have another question: the Hashtable docs say "Note that the hash table is open". What do they mean by open? I believe it should be considered closed (open/closed addressing) because it allows more than one record to be stored at a hash location. – joe.d Aug 13 '17 at 22:59
  • I have to admit I don't know what this word means in this sentence. – Hugues M. Aug 13 '17 at 23:08
  • Thanks anyway... – joe.d Aug 13 '17 at 23:10
  • @joe.d that's a separate question that's answered here https://stackoverflow.com/questions/9124331/meaning-of-open-hashing-and-closed-hashing – pvg Aug 14 '17 at 06:43

2 Answers

I feel bad leaving only comments, this question deserves a proper answer, so here's a quick attempt to rehash (ha) those comments.

  • Hashtable is synchronized: all public methods are synchronized or return a synchronized collection (keySet(), entrySet(), values()), which makes it thread safe (some rules and restrictions may apply).
  • The rehash() method is protected: it can be invoked from a subclass.
  • But it is not synchronized, so an instance of a subclass could invoke rehash() concurrently to another of those methods that need to access table.
    So, to avoid a race condition on table, those methods save a local copy of the reference to the array, and then can safely work with that local array: if rehash() is called, it will build a new array without interfering with other threads working on the old one.
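
To make the bullet points above concrete, here is a minimal sketch, not the JDK code. `TinyTable`, `getViaLocal`, `getViaField` and the `interleaved` callback are hypothetical names invented for this illustration. The callback stands in for another thread calling `rehash()` at the worst possible moment, so the effect is reproducible in a single thread; a real race would be nondeterministic.

```java
class SnapshotDemo {
    // Hypothetical stand-in for Hashtable's internals: one key per bucket, no chaining.
    static class TinyTable {
        private String[] table = new String[4];

        void put(String key) {
            table[(key.hashCode() & 0x7FFFFFFF) % table.length] = key;
        }

        // Unsynchronized, like Hashtable.rehash(): builds a larger array,
        // redistributes the keys, then swaps the field to point at it.
        void rehash() {
            String[] old = table;
            String[] bigger = new String[old.length * 2 + 1];
            for (String key : old) {
                if (key != null) {
                    bigger[(key.hashCode() & 0x7FFFFFFF) % bigger.length] = key;
                }
            }
            table = bigger;
        }

        // Hashtable's style: index and array access both use the same snapshot.
        String getViaLocal(String key, Runnable interleaved) {
            String[] tab = table;
            int index = (key.hashCode() & 0x7FFFFFFF) % tab.length;
            interleaved.run(); // simulated concurrent rehash()
            return tab[index];
        }

        // Reads the field twice: index from the old array, lookup in the new one.
        String getViaField(String key, Runnable interleaved) {
            int index = (key.hashCode() & 0x7FFFFFFF) % table.length;
            interleaved.run(); // simulated concurrent rehash()
            return table[index];
        }
    }

    static String lookup(boolean useLocal) {
        TinyTable t = new TinyTable();
        t.put("a");
        return useLocal ? t.getViaLocal("a", t::rehash)
                        : t.getViaField("a", t::rehash);
    }

    public static void main(String[] args) {
        System.out.println(lookup(true));  // prints "a"
        System.out.println(lookup(false)); // prints "null"
    }
}
```

With the local snapshot, the index and the array it is applied to always belong together, so the key is still found in the old array. With two separate field reads, an index computed for length 4 is applied to an array of length 9, and the lookup misses. The sketch deliberately ignores memory-visibility questions, which a real multithreaded case would also have to consider.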

The version from JDK 1.0.2 was already like that (found it here; the Windows .exe is a self-extracting zip file, so unzip can handle it, and in there you'll find src.zip. It is interesting to see Hashtable.java with 3 classes inside and no inner classes, as those were introduced in 1.1).

This design encourages inheriting from Hashtable to benefit from its functionality, but composition should be favored instead. I found a bug entry that is specific to subclasses of Hashtable (although not about thread safety), in which Josh Bloch says:

This sort of problem can be avoided by using delegation rather than subclassing.
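
As a rough sketch of what delegation could look like here (the `Registry` class and its method names are hypothetical, made up for this illustration), the wrapper holds a Hashtable in a private field instead of extending it, so the protected `rehash()` is simply not reachable from outside:

```java
import java.util.Hashtable;
import java.util.Map;

// Hypothetical wrapper: has-a Hashtable rather than is-a Hashtable.
class Registry<K, V> {
    // Private and final: callers can neither subclass their way to rehash()
    // nor replace the underlying table.
    private final Map<K, V> delegate = new Hashtable<>();

    // Each method forwards to the synchronized Hashtable method of the same purpose.
    public V register(K key, V value) {
        return delegate.put(key, value);
    }

    public V lookup(K key) {
        return delegate.get(key);
    }

    public int size() {
        return delegate.size();
    }
}
```

Only the operations the wrapper chooses to forward are exposed, so every access to the internal array still goes through Hashtable's own synchronized methods.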

Quoting @pvg in comments, for a nice summary:

A somewhat broader answer is that this is basically an early 90s design that has just had a long life and it shows. It's a class that tries to be both threadsafe and extensible through subclassing at the same time, something the class library avoids in later design iterations.

Hugues M.

That is quite simple: rehash() will reassign the table field, so they save it to a temporary variable to access later.

Eugen Martynov
  • `Entry<?,?> tab[] = table;` appears in other methods as well that do not call the rehashing function, for example `get`. Lines 457 - 474 do not call rehashing either, and `addEntry` will itself call `rehash()` – joe.d Aug 13 '17 at 21:43
  • This answer is actually correct, just a little light on details ;) (+1) – Hugues M. Aug 14 '17 at 10:48