We have four different imaging options in WX2:
Random – even round-robin distribution (default)
Hashed – placed onto RAM Stores according to key
Partial Hashed – as Hashed, but handles skewed attributes
Replicated – complete copy on each RAM Store
Replication puts a complete copy of the image on every RAM Store. It can be costly in terms of RAM and redistribution time, so it is best suited to small lookup/dimension tables.
A replicated image cannot be fragmented. Replication is required for theta joins. Replication is per RAM Store, not per node.
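The RAM cost of replication compared with the default random distribution can be illustrated with a small simulation. This is only a sketch: the function names and the list-of-lists model of RAM Stores are assumptions made for illustration, not WX2 internals.

```python
def distribute_random(rows, n_stores):
    """Round-robin placement: row i goes to store i mod n_stores."""
    stores = [[] for _ in range(n_stores)]
    for i, row in enumerate(rows):
        stores[i % n_stores].append(row)
    return stores


def distribute_replicated(rows, n_stores):
    """Replicated placement: every store holds a full copy of the rows."""
    return [list(rows) for _ in range(n_stores)]


rows = list(range(10))   # a toy 10-row table
n_stores = 4             # hypothetical number of RAM Stores

random_image = distribute_random(rows, n_stores)
replicated_image = distribute_replicated(rows, n_stores)

# Total rows held in RAM across all stores for each option.
random_total = sum(len(s) for s in random_image)
replicated_total = sum(len(s) for s in replicated_image)

print([len(s) for s in random_image])      # rows per store, round robin
print(random_total, replicated_total)      # stored once vs. once per store
```

In this model the replicated image occupies n_stores times the RAM of the random image, which is why replication is usually reserved for small tables.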
Hashing distributes the rows of a table or view image across the RAM Stores according to the value of one or more columns. It is good for joining large tables: hash on the common key. It may lead to skewing, for example when the number of distinct values is less than the number of RAM Stores, or when one or two values greatly exceed the others in frequency. Partial distribution may be used to neutralize value skew.
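The two causes of skew described above can be reproduced with a rough simulation. This is a sketch under stated assumptions: CRC32 stands in for whatever hash function WX2 actually uses (the text does not specify it), and the helper names and the store count of 8 are hypothetical.

```python
import zlib


def store_for_key(key, n_stores):
    """Map a key to a store index; CRC32 is a stand-in hash (assumption)."""
    return zlib.crc32(str(key).encode("utf-8")) % n_stores


def hashed_row_counts(keys, n_stores):
    """Return the number of rows each store receives under hashing."""
    counts = [0] * n_stores
    for key in keys:
        counts[store_for_key(key, n_stores)] += 1
    return counts


n_stores = 8  # hypothetical number of RAM Stores

# Case 1: fewer distinct values (3) than stores (8).
# At most 3 stores can receive rows; the rest stay empty.
few_values = ["A", "B", "C"] * 100
few_counts = hashed_row_counts(few_values, n_stores)

# Case 2: one value greatly exceeds the others in frequency.
# All 900 "HOT" rows land on the same store.
dominant = ["HOT"] * 900 + [f"k{i}" for i in range(100)]
dominant_counts = hashed_row_counts(dominant, n_stores)

print(few_counts)
print(dominant_counts)
```

In both cases a few stores hold most of the rows, so those stores do most of the work in a join while the others sit largely idle.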
Partial hashing is a mechanism to handle joins when a large table is severely skewed on its key column(s). It is an alternative to straightforward hashing. There are two types: partial hashed/random across RAM Stores, and partial hashed/replicated across RAM Stores.
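The partial hashed/random type can be sketched as follows. It assumes that rows carrying a known skewed key value are spread round robin across the RAM Stores, while all other rows are hashed as usual. That reading, the CRC32 stand-in hash, and every name below are assumptions made for illustration rather than a description of how WX2 implements it. The replicated variant is not modeled here.

```python
import zlib


def store_for_key(key, n_stores):
    """Map a key to a store index; CRC32 is a stand-in hash (assumption)."""
    return zlib.crc32(str(key).encode("utf-8")) % n_stores


def partial_hashed_random_counts(keys, skewed_values, n_stores):
    """Rows per store when skewed keys go round robin and the rest are hashed."""
    counts = [0] * n_stores
    next_slot = 0  # round-robin cursor used only for skewed rows
    for key in keys:
        if key in skewed_values:
            target = next_slot % n_stores
            next_slot += 1
        else:
            target = store_for_key(key, n_stores)
        counts[target] += 1
    return counts


n_stores = 8  # hypothetical number of RAM Stores
keys = ["HOT"] * 800 + [f"k{i}" for i in range(200)]

# Plain hashing: an empty skewed set means every row is hashed.
plain_counts = partial_hashed_random_counts(keys, set(), n_stores)

# Partial hashing: "HOT" is declared skewed and spread evenly.
partial_counts = partial_hashed_random_counts(keys, {"HOT"}, n_stores)

print(plain_counts)
print(partial_counts)
```

With plain hashing one store receives all 800 "HOT" rows. With the partial scheme each store receives exactly 100 of them, and the remaining 200 rows are hashed on top.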