1

Having returned to development after an absence of over a decade I am getting myself up to speed with the latest technologies for web development. Reading this post I see that I already understood the difference between hashes and arrays.

However, doesn't this mean that arrays are just a type of hash that uses a numerical key? As there is no reason to believe that an implementation of an array will automatically maintain the sequential nature of the array indices (when you delete or insert items for example), is there any greater difference than the inherent ordering of an array?

I mean, to step through an array, you need to set up a loop through the indices, the same as looping through the keys of a hash, and then you could order the numerical hash key set to behave the same (i.e. access the items from 1 to the last number that is a key in the hash in numerical sequence). To access an array element, you use the indices of the value you want, the same as giving the numerical key from the hash.

I came to this question while learning about arrays and hashes in Ruby on Rails, but it is a general question.

Community
  • 1
  • 1
Russell Ormes
  • 525
  • 8
  • 22

2 Answers2

0

A Hash is essentially an array. Hash keys have some type of conversion function to translate an object of another type (or an integer set of other values) into the integer index of an array. Conversely, an array is a Hash that does not translate the keys into a separate type or value of index. However, calling an array a Hash implies an extra layer of functionality that does not exist, since there is no key conversion.

By definition, the objects of an array are stored in consecutive locations in memory, accessible by index.

Even when either type of data structure can be used, one benefit of using hashes with Integer keys is that a larger spread of integers can be stored in a smaller number of buckets. E.g., if your number keys are 5 integers near 1, 10, 100, 1000 and 10000 you don't need 10K buckets to have a hash of these 5 elements, but you do need that many if you use a straight-up array. Hash functions tend to be recalculated and more memory reallocated as the hash grows and the benefit of using an array is that its size can be more easily controlled and can remain fixed.

La-comadreja
  • 5,627
  • 11
  • 36
  • 64
  • 1
    Okay, so the benefit of having separate implementations of arrays and hashes is one of complexity? That is to say, sometimes an array does everything needed, so using a hash with a numerical key is going from array, to hash, back to array (due to the point you made about conversion of keys to integers), back to a hash, which you then force to behave as an array! Should I think of hashes as adding some functionality to the array data structure rather that an array being a special case of a hash data structure? – Russell Ormes Feb 20 '14 at 19:44
  • Just changed my answer. Yes, it is less efficient to use a Hash with a numerical key if the keys are more or less consecutive integers. The hash function and rehashing can be extra overhead. I like to think of hashes as a translation between other types, including objects, and array indices. You don't always need the translation and when you don't, it's counterproductive. – La-comadreja Feb 20 '14 at 19:53
  • 1
    Ok, thanks. Although they seem to be the same or very similar, I think they are, in fact, different beasts. It seems to me that whenever it is appropriate an array should be preferred. Hashes are for different situations, when the goal is not simply to store a range of data, but when it is important what the array key is. Thanks for the explanation @La-comadreja. – Russell Ormes Feb 20 '14 at 19:56
0

Here is how declarative programming defines it.
Difference between Declarative and Procedural Programming?
http://en.wikipedia.org/wiki/Procedural_programming
http://en.wikipedia.org/wiki/Declarative_programming

There are primitive, composite, and abstract data structures.
- An Array is composite.
- A Hash is abstract.

We have both because they are fundamentally different. For example you can't pop/push primitives into a Hash like you can with an Array, because a Hash uses unordered key-values while an Array has an index.

http://en.m.wikipedia.org/wiki/List_of_data_structures

Community
  • 1
  • 1
wurde
  • 2,487
  • 2
  • 20
  • 39
  • Thanks for the link, but I am not sure that is correct. An abstract data type is a description of a data type given solely on the functionality. As you can describe an array mathematically as a datatype of two actions (get(A,i) and set(A,i,V)) such that get returns the element at position 'i' and set sets the element at position 'i' to 'V' such that get(set(A,i,V),i) = V, an array can also be an abstract data type. I take the point that they are different data types, as mentioned in the comments above, but I dispute that it is because they belong to different categories. – Russell Ormes Feb 26 '14 at 16:29
  • You would be correct if we were talking procedural languages, in the context of data structures and declarative programming an abstract is not based on functionality but definition. – wurde Feb 26 '14 at 16:35
  • A great explanation of this on SO: http://stackoverflow.com/questions/1619834/difference-between-declarative-and-procedural-programming – wurde Feb 26 '14 at 16:36
  • @NectarSoft Let me apologize, you and I are both correct. – wurde Feb 26 '14 at 17:28
  • Sorry, wurde, I am not sure what point your making. Data structures in the context you used them in your answer are independent of programming languages. – Russell Ormes Feb 26 '14 at 20:54
  • Ok. Describing a datatype as 'two actions' signals you think procedurally. The world around you is a sea of expressions, and what isn't immediately obvious to you are the transient states. – wurde Feb 26 '14 at 21:06