83

Say you have a very simple data structure:

(personId, name)

...and you want to store a number of these in a javascript variable. As I see it you have three options:

// a single object
var people = {
    1 : 'Joe',
    3 : 'Sam',
    8 : 'Eve'
};

// or, an array of objects
var people = [
    { id: 1, name: 'Joe'},
    { id: 3, name: 'Sam'},
    { id: 8, name: 'Eve'}
];

// or, a combination of the two
var people = {
    1 : { id: 1, name: 'Joe'},
    3 : { id: 3, name: 'Sam'},
    8 : { id: 8, name: 'Eve'}
};

The second or third option is obviously the way to go if you have (or expect that you might have) more than one "value" part to store (eg, adding in their age or something), so, for the sake of argument, let's assume that there's never ever going to be any more data values needed in this structure. Which one do you choose and why?


Edit: The example now shows the most common situation: non-sequential ids.

nickf
  • 537,072
  • 198
  • 649
  • 721
  • 1
    Also there is a 2d array: `var people = [[1, 'Joe'],[3, 'Sam'],[8,'Eve']];` (which doesn't give you fast look-up by id, but is another option for storing the data) – User Dec 30 '15 at 03:24

6 Answers6

58

Each solution has its use cases.

I think the first solution is good if you're trying to define a one-to-one relationship (such as a simple mapping), especially if you need to use the key as a lookup key.

The second solution feels the most robust to me in general, and I'd probably use it if I didn't need a fast lookup key:

  • It's self-describing, so you don't have to depend on anyone using people to know that the key is the id of the user.
  • Each object comes self-contained, which is better for passing the data elsewhere - instead of two parameters (id and name) you just pass around people.
  • This is a rare problem, but sometimes the key values may not be valid to use as keys. For example, I once wanted to map string conversions (e.g., ":" to ">"), but since ":" isn't a valid variable name I had to use the second method.
  • It's easily extensible, in case somewhere along the line you need to add more data to some (or all) users. (Sorry, I know about your "for argument's sake" but this is an important aspect.)

The third would be good if you need fast lookup time + some of the advantages listed above (passing the data around, self-describing). However, if you don't need the fast lookup time, it's a lot more cumbersome. Also, either way, you run the risk of error if the id in the object somehow varies from the id in people.

Dan Lew
  • 85,990
  • 32
  • 182
  • 176
  • 1
    on your third point there. Objects properties can be accessed in bracket notation to avoid that problem: var x = { "a.b>c++" : "foo"}; alert(x['a.b>c++']); – nickf Mar 27 '09 at 02:19
  • You do have the problem of a few reserved words with properties, I believe. But no problems when you're using numbers. – derobert Mar 27 '09 at 02:59
  • 4
    @derobert: I don't think so. x = { "delete" : 1, "for" : 2, "function" : 3}; -- this is valid and can be accessed in the same way. – nickf Mar 27 '09 at 03:06
7

Actually, there is a fourth option:

var people = ['Joe', 'Sam', 'Eve'];

since your values happen to be consecutive. (Of course, you'll have to add/subtract one --- or just put undefined as the first element).

Personally, I'd go with your (1) or (3), because those will be the quickest to look up someone by ID (O logn at worst). If you have to find id 3 in (2), you either can look it up by index (in which case my (4) is ok) or you have to search — O(n).

Clarification: I say O(logn) is the worst it could be because, AFAIK, and implementation could decide to use a balanced tree instead of a hash table. A hash table would be O(1), assuming minimal collisions.

Edit from nickf: I've since changed the example in the OP, so this answer may not make as much sense any more. Apologies.

Post-edit

Ok, post-edit, I'd pick option (3). It is extensible (easy to add new attributes), features fast lookups, and can be iterated as well. It also allows you to go from entry back to ID, should you need to.

Option (1) would be useful if (a) you need to save memory; (b) you never need to go from object back to id; (c) you will never extend the data stored (e.g., you can't add the person's last name)

Option (2) is good if you (a) need to preserve ordering; (b) need to iterate all elements; (c) do not need to look up elements by id, unless it is sorted by id (you can do a binary search in O(logn). Note, of course, if you need to keep it sorted then you'll pay a cost on insert.

derobert
  • 49,731
  • 15
  • 94
  • 124
  • I understand why (1) and (3) are faster then (2), but I'm wondering how you got O(log(n)) for the time? Shouldn't it be O(1) as the ids should be stored in a hash table? – Nathaniel Reinhart Mar 27 '09 at 01:18
  • I gave log n as worst, because I could see an implementation deciding to use a balanced tree instead of hash table. A hash table would be O(1) [assuming collisions are minimal], of course. – derobert Mar 27 '09 at 01:37
  • the IDs aren't always as simple as that and they'd rarely be sequential IRL, which is why you need to specify them. I do admit it was a poor/ambiguous example though. – nickf Mar 27 '09 at 02:21
  • +1, your answer is not worse than the accepted one...but your fourth solution doesn't store all data, so here's the correct fourth option: **`var people = []; people[1] = 'Joe'; people[3] = 'Sam'; people[8] = 'Eve';`** or as a two-dimensional array: `var people = []; people[1] = ['Joe', 'Smith']; people[3] = ['Sam', 'Miller']; people[8] = ['Eve', 'Sweet'];` (will be accessed by e.g. 'people[3][1];' which works very well for fast internal data storage inside any class - well, with class I mean the "new WhateverFunction();" thing which Javascripters don't like, I do ;-)... – Marcus Nov 28 '13 at 21:34
  • ...But this only works if the id is an integer, otherwise the "classic" solution 2 (together with the famous `for(var key in object)` for access) is the way to go in normal cases (it has O(n/2)), or solution 3 if you have a really big array for faster access because it has O(1)...and finally let's forget about above solution 1, not only if your id is an integer, but also in case the id is no integer: `var people = {'Smith':'Joe', 'Miller':'Sam', 'Sweet': 'Eve' };` (because you would need to access your object like this: `alert(people.Smith);` which isn't really nice!). – Marcus Nov 28 '13 at 21:34
2

Assuming the data will never change, the first (single object) option is the best.

The simplicity of the structure means it's the quickest to parse, and in the case of small, seldom (or never) changing data sets such as this one, I can only imagine that it will be frequently executed - in which case minimal overhead is the way to go.

karim79
  • 339,989
  • 67
  • 413
  • 406
2

I created a little library to manage key value pairs.

https://github.com/scaraveos/keyval.js#readme

It uses

  • an object to store the keys, which allows for fast delete and value retrieval operations and
  • a linked list to allow for really fast value iteration

Hope it helps :)

scaraveos
  • 1,016
  • 2
  • 10
  • 12
1

The third option is the best for any forward-looking application. You will probably wish to add more fields to your person record, so the first option is unsuitable. Also, it is very likely that you will have a large number of persons to store, and will want to look up records quickly - thus dumping them into a simple array (as is done in option #2) is not a good idea either.

The third pattern gives you the option to use any string as an ID, have complex Person structures and get and set person records in a constant time. It's definitely the way to go.

One thing that option #3 lacks is a stable deterministic ordering (which is the upside of option #2). If you need this, I would recommend keeping an ordered array of person IDs as a separate structure for when you need to list persons in order. The advantage would be that you can keep multiple such arrays, for different orderings of the same data set.

levik
  • 114,835
  • 27
  • 73
  • 90
0

Given your constraint that you will only ever have name as the value, I would pick the first option. It's the cleanest, has the least overhead and the fastest look up.

Nathaniel Reinhart
  • 1,173
  • 1
  • 8
  • 21