
We have a situation that is best described with an analogy. Assume we have a mapping from name to city + state + zip code, where the name is guaranteed to be unique. Our sample data (adapted from here) is as follows:

James Butt: New Orleans LA 70116
Josephine Darakjy: Brighton MI 48116
Art Venere: Bridgeport NJ 8014
Lenna Paprocki: Anchorage AK 99501
Donette Foller: Hamilton OH 45011
Simona Morasca: Ashland OH 44805
Mitsue Tollner: Chicago IL 60632
Leota Dilliard: San Jose CA 95111
Sage Wieser: Sioux Falls SD 57105
Kris Marrier: Baltimore MD 21224

What would be a good JSON structure in this case? We are looking for whichever is more efficient for JavaScript to process. We see at least two options.

Option 1

{
"James Butt":  "New Orleans LA 70116",
"Josephine Darakjy":  "Brighton MI 48116",
"Art Venere":  "Bridgeport NJ 8014",
"Lenna Paprocki":  "Anchorage AK 99501",
"Donette Foller":  "Hamilton OH 45011",
"Simona Morasca":  "Ashland OH 44805",
"Mitsue Tollner":  "Chicago IL 60632",
"Leota Dilliard":  "San Jose CA 95111",
"Sage Wieser":  "Sioux Falls SD 57105",
"Kris Marrier":  "Baltimore MD 21224"
}

Option 2

[
   {
      "name": "James Butt",
      "address": "New Orleans LA 70116"
   },
   {
      "name": "Josephine Darakjy",
      "address": "Brighton MI 48116"
   },
   {
      "name": "Art Venere",
      "address": "Bridgeport NJ 8014"
   },
   {
      "name": "Lenna Paprocki",
      "address": "Anchorage AK 99501"
   },
   {
      "name": "Donette Foller",
      "address": "Hamilton OH 45011"
   },
   {
      "name": "Simona Morasca",
      "address": "Ashland OH 44805"
   },
   {
      "name": "Mitsue Tollner",
      "address": "Chicago IL 60632"
   },
   {
      "name": "Leota Dilliard",
      "address": "San Jose CA 95111"
   },
   {
      "name": "Sage Wieser",
      "address": "Sioux Falls SD 57105"
   },
   {
      "name": "Kris Marrier",
      "address": "Baltimore MD 21224"
   }
]
  
Sandeep
  • Both can always be easily converted back and forth. Regarding accessing data, the first can skip iteration since you can access a unique record directly by its key (the name). I would prefer to have an ID instead of the name as key. – Roko C. Buljan Jul 12 '20 at 20:51
  • I would go for two. The structure resembles what you get from a DB query. As @RokoC.Buljan says, adding a unique ID would be best. – Raffobaffo Jul 12 '20 at 21:02
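As the first comment notes, the two shapes convert back and forth easily. A minimal sketch (using a subset of the sample data) with `Object.entries()` and `Object.fromEntries()`:

```javascript
// Option 1 shape: object keyed by name
const byName = {
  "James Butt": "New Orleans LA 70116",
  "Josephine Darakjy": "Brighton MI 48116"
};

// Option 1 -> Option 2: array of { name, address } records
const asArray = Object.entries(byName).map(([name, address]) => ({ name, address }));

// Option 2 -> Option 1: back to an object keyed by name
const backToObject = Object.fromEntries(asArray.map(({ name, address }) => [name, address]));
```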

3 Answers


Option 1: This is a bad idea even if you are completely certain there are no duplicate names. First, in any real-world data set, duplicate names are inevitable. Second, what if a name contains a special character?

Option 2: This is the right way to format the data, except as noted in the comments there should also be a unique ID field mapped to each name.


const data = [
   {
      "id": 1, 
      "name": "James Butt",
      "address": "New Orleans LA 70116"
   }, 
   ...
];

user = data.find(({id}) => id === 1); // gives you this user (filter() would return a one-element array)
user = data.find(({name}) => name === 'James Butt'); // also works

The reason to put the unique ID inside the data itself is to avoid relying on the order in which records are inserted into the array. Any data coming from a real-world database will have some sort of unique ID field, numeric or alphanumeric, but the order in which rows are returned is not a reliable indicator of a record's true ID.
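If lookups by ID happen often, one option (a sketch, not part of the original answer) is to index the array in a `Map` once, then fetch records in O(1) instead of scanning with `find()` each time:

```javascript
// Hypothetical data in the answer's shape
const data = [
  { id: 1, name: "James Butt", address: "New Orleans LA 70116" },
  { id: 2, name: "Josephine Darakjy", address: "Brighton MI 48116" }
];

// Build the index once: id -> record
const byId = new Map(data.map(record => [record.id, record]));

// Constant-time lookup, no array scan
const user = byId.get(1);
```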

joshstrike

For efficiency (and to prevent clashes between duplicate name keys),
I'd map the string data into an object whose keys are unique IDs (just like in a database):

{
   7 : {
      "id": 7,
      "name": "James Butt",
      "address": "New Orleans LA 70116"
   },
   19 : {
      "id": 19,
      "name": "Josephine Darakjy",
      "address": "Brighton MI 48116"
   }
}

Extracting a single record:

We gain speed by extracting a record directly by its key ID, much like a memory-address lookup, instead of iterating (in the worst case) the entire array in search of a single record.

Extracting Multiple records:

You can always iterate objects using Object.keys(), Object.values() or Object.entries() and loop, just as you would with an Array.

Since ES2015, object key order is predictable: integer-like keys are iterated in ascending numeric order. So an object keyed by incremental database IDs comes back ordered by ID, and therefore typically by creation date, out of the box.

Example: Get record by ID or Value

const users = {
   6 : {
      "id": 6,
      "name": "Antonette Darakjy",
      "address": "Brighton MI 48116"
   },
   7 : {
      "id": 7,
      "name": "James Butt",
      "address": "New Orleans LA 70116"
   },
   19 : {
      "id": 19,
      "name": "Josephine Darakjy",
      "address": "Brighton MI 48116"
   },
};

// GET BY ID
console.log(users[7]); // No need to iterate tens of thousands of records

// GET BY KEY VALUE
// Return an array of filtered users 
const filterBy = (prop, val) => Object.values(users).filter(user => {
  return user[prop] === val;
});
console.log(filterBy("address", "Brighton MI 48116"));
Roko C. Buljan
    Using the id as an object key in addition to storing it in the data itself can be a good answer for efficiency, as long as you know that the keys are both unique and properly escaped. In many cases we keep a normal array and also an object linking to the same array elements that's keyed by ID. Specifically because if we want to search by ID, the hashed key search is very fast, but if you want to filter or loop them it's faster to do so on the array (and doesn't require writing your own map/filter/reduce code). – joshstrike Jul 12 '20 at 22:05
  • @joshstrike if you want to loop or filter an Array, yes *you do* need to write some code, and yes using `filter()`, `reduce()` or any other Array.prototype method - which in the V8 engine is about as fast as a `for` loop. – Roko C. Buljan Jul 12 '20 at 22:10
  • This is some absolutely unjustified use of `reduce()`... A simple loop would be easier to read and way more efficient. Also, `Object.values()` just doubles memory consumption. All for the sake of applying FP where it doesn't belong. – x1n13y84issmd42 Jul 12 '20 at 22:15
  • 1
    @RokoC.Buljan My point was only that using .filter() on an array is faster than using Object.values(n).reduce() with a hash lookup inside of it. That would be the reason for keeping two copies of the model, one in a flat array and another set of references by id in an object. – joshstrike Jul 13 '20 at 00:36
  • 1
    RokoCBuljan, JSBench reference? An integer indexed for loop (on homogeneous data) is almost always an order of magnitude faster than reduce (on V8) at a certain scale no matter how well implemented in JS, although this is overstated because it's doubtful it will be your bottleneck since the speed is on such a small scale. @x1n13y84issmd42 it's a shallow copy of references that can be immediately GCed (and mostly optimized away), so the whole doubling of memory is overstated, even if somewhat true. I can't say I disagree with the sentiment on reduce, although that's highly subject to opinion. – user120242 Jul 13 '20 at 00:54
  • 1
    @x1n13y84issmd42 absolutely agree. `.reduce` was indeed unjustified. – Roko C. Buljan Jul 13 '20 at 10:55

Not sure what you really mean by "processing efficiency", but I'm guessing you would like to look up people's addresses by name efficiently. The proposed solutions require iterating over the entire collection, so they'll work on 500 records but not on 1M.

Also, as other people have mentioned, name collisions are real (especially once you get to that 1M-record set), so let's assume you need two features from your data structure:

  1. efficient address lookups by name;
  2. no name collisions.

This means you need a multimap whose keys are people's names and whose values are sets of addresses: basically your first option, but with arrays of addresses as field values. Alternatively, you can use a Map for more efficient additions/removals and easier access to the number of records (unique names) stored.

Maps (and Sets) in the V8 implementation are said to have O(1) average lookup time, versus the linear time of scanning an array. So they're efficient.
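A minimal sketch of such a multimap, using a `Map` of name → `Set` of addresses (the `addRecord` helper is hypothetical, not from the question):

```javascript
// name -> Set of addresses, so duplicate names don't collide
const addressesByName = new Map();

function addRecord(name, address) {
  if (!addressesByName.has(name)) addressesByName.set(name, new Set());
  addressesByName.get(name).add(address);
}

addRecord("James Butt", "New Orleans LA 70116");
addRecord("James Butt", "Baltimore MD 21224"); // a second, different James Butt
addRecord("Lenna Paprocki", "Anchorage AK 99501");

// O(1) average lookup by name; returns the whole set of addresses
const addrs = addressesByName.get("James Butt");
```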

x1n13y84issmd42