16

I know that more-dynamic-than-Java languages, like Python and Ruby, often allow you to place objects of mixed types in arrays, like so:

["hello", 120, ["world"]]

What I don't understand is why you would ever use a feature like this. If I want to store heterogeneous data in Java, I'll usually create an object for it.

For example, say a User has an int ID and a String name. While I see that in Python/Ruby/PHP you could do something like this:

[["John Smith", 000], ["Smith John", 001], ...]

this seems a bit less safe/OO than creating a class User with attributes ID and name and then having your array:

[<User: name="John Smith", id=000>, <User: name="Smith John", id=001>, ...]

where those <User ...> things represent User objects.

Is there a reason to use the former over the latter in languages that support it? Or is there some bigger reason to use heterogeneous arrays?

N.B. I am not talking about arrays that include different objects that all implement the same interface or inherit from the same parent, e.g.:

class Square extends Shape
class Triangle extends Shape
[new Square(), new Triangle()]

because that is, to the programmer at least, still a homogeneous array: you'll be doing the same thing with each shape (e.g., calling the draw() method), using only the methods the two have in common.

Aaron Yodaiken
  • 2
    Python and Ruby are dynamically typed, so heterogeneous arrays are "free" to implement. – Etienne de Martel Dec 26 '10 at 16:50
  • 1
    "this seems a bit less safe"? What do you mean by "safe"? Please note that "type safety" is an absolute guarantee in Python, since an object's type is coupled with the value in a way that can't be undone by something like a cast operation. Please define what you mean by "safe" in this question. – S.Lott Dec 26 '10 at 17:00
  • 6
    One kind of safety is that coding mistakes are detected at compile-time. And this is something dynamically typed languages can't provide for many kinds of mistakes. Of course you gain flexibility in return. – CodesInChaos Dec 26 '10 at 17:09
  • Safety, as in: I know via constructors that every element has a name and an ID, so some part of my program won't accidentally stick in a `["Smith John", 000, "email@example.com"]` without me knowing. – Aaron Yodaiken Dec 26 '10 at 17:31
  • 1
    How would the program stick something like that in unless you told it to do that in the first place? – the Tin Man Dec 26 '10 at 21:43
  • @aharon "some part of my program won't stick a `["Smith John", 000, "email@example.com"` accidentally either without me knowing" -- so you never make mistakes? You and everyone who works on this code with you always call functions in exactly the way they're supposed to be called, with all the arguments in exactly the right order? I can't believe that's what you're suggesting, but that is what it sounds like, and I'm not sure what you *are* saying. – Tyler Dec 27 '10 at 03:31
  • 2
    No—I'm saying that because I do make mistakes, being able to have objects with a constructor in the form `( (String) name, (int) id )` allows the language to send an error when someone codes something incorrectly, e.g. `name, id, email`. – Aaron Yodaiken Dec 27 '10 at 08:39
  • @aharon: "allows the language to send an error when someone codes something incorrectly" Not really. You could easily create a class with a proper constructor that still behaved badly because a method was implemented incorrectly. – S.Lott Dec 27 '10 at 15:08

8 Answers

4

Applying a multimethod to the array might make some sense. You switch the strategy to a more functional style in which you focus on a discrete piece of logic (i.e. the multimethod) instead of a discrete piece of data (i.e. the array objects).

In your shapes example, this prevents you from having to define and implement the Shape interface. (Yes, it's not a big deal here, but what if shape was one of several superclasses you wanted to extend? In Java, you're SOL at this point.) Instead, you implement a smart draw() multimethod that first examines the argument and then dispatches to the proper drawing functionality or error handling if the object isn't drawable.
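The idea of a draw() function that inspects its argument and dispatches can be sketched in Python with `functools.singledispatch`. This is only an approximation: it dispatches on the type of the first argument alone, whereas a full multimethod can dispatch on several. The `Square` and `Triangle` classes and the returned strings are hypothetical stand-ins for real drawing code.

```python
from functools import singledispatch


# Hypothetical shape classes with no common Shape parent or interface.
class Square:
    def __init__(self, side):
        self.side = side


class Triangle:
    def __init__(self, base, height):
        self.base = base
        self.height = height


@singledispatch
def draw(shape):
    # Fallback: error handling for anything that isn't drawable.
    raise TypeError(f"cannot draw {type(shape).__name__}")


@draw.register(Square)
def _(shape):
    return f"square with side {shape.side}"


@draw.register(Triangle)
def _(shape):
    return f"triangle with base {shape.base} and height {shape.height}"


shapes = [Square(2), Triangle(3, 4)]
descriptions = [draw(s) for s in shapes]
```

The logic for each type lives next to the function rather than inside the classes, so a new drawable type can be supported by registering another implementation without touching the class itself.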

Comparisons between functional and object-oriented styles are all over the place; here are a couple relevant questions that should provide a good start: Functional programming vs Object Oriented programming and Explaining functional programming to object-oriented programmers and less technical people.

Community
G__
  • @aharon The last paragraph regarding functional vs object-oriented programming is for the general benefit of whoever is reading this answer down the road. I'm not implying that you aren't aware of the differences. – G__ Dec 26 '10 at 17:13
4

As katrielalex wrote: There is no reason not to support heterogeneous lists. In fact, disallowing it would require static typing, and we're back to that old debate. But let's refrain from doing so and instead answer the "why would you use that" part...

To be honest, it is not used that much -- if we make use of the exception in your last paragraph and choose a more liberal definition of "implement the same interface" than e.g. Java or C#. Nearly all of my iterable-crunching code expects all items to implement some interface. Of course it does, otherwise it could do very little with them!

Don't get me wrong, there are absolutely valid use cases - there's rarely a good reason to write a whole class for containing some data (and even if you add some callables, functional programming sometimes comes to the rescue). A dict would be a more common choice though, and namedtuple is very neat as well. But they are less common than you seem to think, and they are used with thought and discipline, not for cowboy coding.

(Also, your "User as nested list" example is not a good one - since the inner lists are fixed-size, you had better use tuples, and that makes it valid even in Haskell (the type would be [(String, Integer)]).)
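As a rough illustration of the namedtuple option mentioned above, the User records from the question could look like this. The field names `name` and `id` are taken from the question; the rest is a minimal sketch rather than a recommended design.

```python
from collections import namedtuple

# A lightweight record type: fixed fields, no hand-written class body.
User = namedtuple("User", ["name", "id"])

users = [User("John Smith", 0), User("Smith John", 1)]

# Fields are accessible by name...
names = [u.name for u in users]

# ...and each record still unpacks and compares like a plain tuple.
name, user_id = users[1]
```

Passing the wrong number of fields, such as `User("Smith John", 1, "email@example.com")`, raises a `TypeError` at the point of construction, although the field types themselves are not checked.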

  • 1
    What are these 'absolutely valid use cases'? – Aaron Yodaiken Jan 01 '11 at 16:12
  • @aharon: Pretty much every time they work and are less of a hassle than the alternatives. For example, dicts with heterogeneous values (the question is about arrays, but really any collection can be heterogeneous) make good lightweight classes. Heterogeneous tuples/lists are less self-documenting and therefore not quite as advisable, but also valid depending on the data (e.g. too little data used in too few places to make a dict notably superior). –  Jan 01 '11 at 16:22
  • I know it's years later, but: for me, a common use-case in JavaScript is taking `arguments`, converting it to a (heterogeneous) array, popping/shifting some arguments, then using `.apply()`. I've used this a lot when writing event-handling libraries (`.on()`, `.off()`), or writing functions which wrap/lift/modify other functions. – cloudfeet Oct 08 '18 at 10:52
3

Is there a reason to use the former over the latter in languages that support it?

Yes, there is a very simple reason why you can do this in Python (and I assume the same reason applies in Ruby):

How would you check that a list is heterogeneous?

  • It can't just compare the types directly because Python has duck typing.
  • If all the objects have some common typeclass, Python has no way to guess that either. Everything supports being represented anyway, so you should be able to put them in a list together too.
  • It wouldn't make any sense to turn lists into the only type that needs a type declaration either.

There is simply no way to prevent you from creating a heterogeneous list!

Or is there some bigger reason to use heterogeneous arrays?

No, I can't think of any. As you already mentioned in your question, if you use heterogeneous arrays you're just making things harder than they have to be.

Jochen Ritzel
  • Come on, it is very simple to implement. For example, the constructor of the list class could require the type, so something like `arr = list(type = int)`. This information would be stored inside the list. Down the road, if you try something like `arr.append("hello")` it raises an exception. Simple as that! – Aykhan Hagverdili May 08 '23 at 21:51
  • Here you go: https://python.godbolt.org/z/Yq6Kc3xbo – Aykhan Hagverdili May 08 '23 at 22:07
2

There is no reason not to support heterogeneous lists. Disallowing them would be a limitation imposed for technical reasons, and we don't like those.

Not everything needs to be a class!

In Python, a class is basically a souped up dictionary with some extra stuff anyway. So making a class User is not necessarily any clearer than a dictionary {"name": ..., "id": ...}.
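To make the "souped up dictionary" remark concrete, here is a small sketch. It assumes an ordinary class that does not define `__slots__`; the `User` class and its fields simply mirror the question's example.

```python
class User:
    def __init__(self, name, id):
        self.name = name
        self.id = id


as_object = User("John Smith", 0)
as_dict = {"name": "John Smith", "id": 0}

# vars() exposes the instance's attribute dictionary (its __dict__).
attributes = vars(as_object)

# The instance's state is literally a dict with the same keys and values.
same_state = attributes == as_dict
```

What the class adds on top of the dictionary is mainly attribute syntax, a constructor, and a place to hang methods.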

Katriel
  • But the dictionary cannot have extra methods if I want to extend them as such later, right? – Aaron Yodaiken Dec 26 '10 at 16:55
  • No, but if you want extra methods you can subclass `dict`. And this is extensible; you can mix and match dictionaries with `FunkyDictionaries` and custom mappings, as long as when you run the code all the methods that you require are there. Note also that you can have global functions (not attached to a class), so for instance you can define a function `split_name = lambda d: d['name'].split()` to return the first and last names, split by whitespace. And this function doesn't have to be associated with the `dict` class. – Katriel Dec 26 '10 at 16:56
  • 2
    @aharon: If you want custom methods, i.e. real *behaviour*, a class is warranted. But not if you just want to structure some data and apply transformations to it - for these (very common) cases, a class adds zero benefit and too many extra lines. –  Dec 26 '10 at 17:14
2

There is nothing to stop you having a heterogeneous array in Java. It is considered poor programming style, though, and proper POJOs will be faster and more efficient than heterogeneous arrays, in Java or any other language, because the types of the "fields" are statically known and primitives can be used.

In Java you can write:

Object[][] array = {{"John Smith", 000}, {"Smith John", 001}, ...};
Peter Lawrey
0

In Lua, an object and an array are the same thing, so the reason is clearer. Let's say that Lua takes weak typing to the extreme.

Apart from that, I had a Google Map object and needed to delete all the markers created so far on that map. I ended up creating an array for markers, an array for circles, and an array for places. Then I wrote a function to iterate over those three arrays and call .remove() on each element. I then realized that I could just have a single non-homogeneous array, insert all the objects into it, and iterate once over that array.
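The same pattern can be sketched in Python. The `Marker`, `Circle`, and `Place` classes below are hypothetical stand-ins, not the Google Maps API; the only thing they share is a `remove()` method, with no common parent class or declared interface.

```python
# Hypothetical overlay types that are unrelated except for having remove().
class Marker:
    def __init__(self):
        self.on_map = True

    def remove(self):
        self.on_map = False


class Circle:
    def __init__(self):
        self.on_map = True

    def remove(self):
        self.on_map = False


class Place:
    def __init__(self):
        self.on_map = True

    def remove(self):
        self.on_map = False


# One mixed list instead of three separate, per-type lists.
overlays = [Marker(), Circle(), Place(), Marker()]

# Duck typing: anything with a remove() method can be cleared in one pass.
for overlay in overlays:
    overlay.remove()
```

Whether this counts as truly heterogeneous is debatable, since every element is used through the same implicit interface, which is the exception the question carves out.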

Gianluca Ghettini
0

Heterogeneous lists are very useful. For instance, to make the game of snake, I can have a list of blocks like this: [[x, y, 'down'], [x1, y1, 'down']] instead of a class for the blocks, and I can access every element faster.

pythonFoo
  • 1
    It's still a homogeneous list of (int, int, string) tuples. – Jochen Ritzel Dec 26 '10 at 17:06
  • 1
    but int, int, string is heterogeneous. – pythonFoo Dec 26 '10 at 17:10
  • 2
    I wouldn't conceptually regard your inner list as a heterogeneous list but as a 3-tuple. But of course in dynamically typed languages it is backed by the same type. But in statically typed languages you can achieve the same with tuples without losing (compile-time) type-safety. So it's not a good example to show where heterogeneous lists are useful beyond what is provided in statically typed languages. – CodesInChaos Dec 26 '10 at 17:14
  • 3
    "and I can access faster to every element." [citation needed] – Tyler Dec 26 '10 at 17:18
  • 1
    @MatrixFrog: http://pastebin.com/10DCvtuQ this little snippet, on my computer, outputs: 6.58470416069 6.20616889 – pythonFoo Dec 26 '10 at 17:47
  • @pythonFoo. I admit I am surprised. I thought maybe there was some kind of weird optimization going on so I modified it slightly: https://gist.github.com/755616 and even with that change, the array access is still faster. – Tyler Dec 26 '10 at 20:51
  • I assume this is because objects (which are basically just dicts) are designed to have fast access when there are lots of fields, but they didn't spend much time worrying about the case where there are only a couple fields, so there's a little bit of overhead that slows you down in that case. – Tyler Dec 26 '10 at 20:55
-1

Here is a simple answer:

N.B. I am not talking about arrays that include different objects that all implement the same interface or inherit from the same parent, e.g.:

Everything extends java.lang.Object... and that's plenty. There is no reason not to have an Object[] and put anything you like in it. Object[] arrays are exceptionally useful in middleware such as a persistence layer.

xss
  • -1. If you're using Java, you might as well take advantage of its compile-time type-checking. Any code that uses `Object` or `Object[]` immediately raises a red flag in my mind. – Tyler Dec 26 '10 at 20:56