
I am pulling a database query that has the following info:

id, name, roleId, roleTitle

In the query, I am pulling users and their roles. Each user can have 0 to N roles. In the end, I want an object shaped like this:

{
    id
    name
    roles: [{
       id
       title
    }]
}

What would the most efficient way of doing this be? Currently I am doing something like this:

const data = [];
arr.forEach((u) => {
  const index = data.findIndex(x => x.id === u.id);
  if (index >= 0) {
    data[index].roles.push({ id: u.roleId, title: u.roleTitle });
  } else {
    data.push({
      id: u.id,
      name: u.name,
      roles: u.roleId ? [{
        id: u.roleId,
        title: u.roleTitle,
      }] : [],
    });
  }
});

This solution works correctly, but I wasn't sure whether it is the fastest approach if we scale to 10k users with an average of 3 roles per user, or 50k users with 5 roles per user.

user081608

3 Answers


Your best bet is actually to do this all in SQL, since you are using PostgreSQL for your database (as mentioned in comments). I don't know the exact names of your tables and columns, so you may need to tweak this, but this will get you what you want:

SELECT json_agg(t)
FROM (
  SELECT
    u.id,
    u.name,
    ro.roles
  FROM "user" u
  LEFT JOIN (
    SELECT
      ur.user_id,
      json_agg(
        json_build_object(
          'id', r.id,
          'title', r.title
        )
      ) AS roles
    FROM user_role ur
    LEFT JOIN "role" r ON r.id = ur.role_id
    GROUP BY ur.user_id
  ) ro ON ro.user_id = u.id
) t;

SQL Fiddle: http://www.sqlfiddle.com/#!17/5f6ca/11

Explanation

json_build_object will create an object using the name / value pairs specified, so:

json_build_object(
  'id', r.id,
  'title', r.title
)

Combines the role id and title into a JSON object like this:

{"id": 1, "title": "Title 1"}

json_agg aggregates multiple rows into a JSON array, so it converts the role objects above into a single column that is an array of role objects per user (thanks to the `GROUP BY ur.user_id` in the inner subquery). The inner subquery gives us a result set like this (one row per user):

| user_id |                       roles                          |
|---------|------------------------------------------------------|
|    1    | [{id: 1, title: "Role 1"}, {id: 2, title: "Role 2"}] |

Then the subquery is joined to the user table, and all of that is wrapped in another subquery so json_agg can be used on the entire result and return a single json object that is an array of users with roles.
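On the application side that means there is almost nothing left to do: the query returns a single row whose only column is the finished structure. A minimal sketch, assuming your driver (e.g. node-postgres) hands the column back either pre-parsed or as a raw JSON string; the `row` literal below is a stand-in for a real query result:

```javascript
// Stand-in for the single row returned by the json_agg query above.
// A real driver would produce this row from the query result, and may
// already have parsed the json column into a JavaScript value.
const row = {
  json_agg: '[{"id":1,"name":"User 1","roles":[{"id":1,"title":"Role 1"},{"id":2,"title":"Role 2"}]}]'
};

// Parse only if the driver returned the column as a raw string.
const users = typeof row.json_agg === 'string'
  ? JSON.parse(row.json_agg)
  : row.json_agg;

console.log(users[0].roles.length); // 2
```

No per-row grouping loop is needed in JavaScript at all, which is why pushing the aggregation into the database tends to win at the 10k–50k row scale the question mentions.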

patrick3853
  • Thank you! This looks great. One last question, in the last result, it has `{"id":3,"name":"User 3","roles":[{"id" : null, "title" : null}]}`. Is there a way to return `null` for `roles` when there are no results? – user081608 Apr 17 '19 at 03:14
  • @user081608 Yes, that should be possible, give me a minute to update it. – patrick3853 Apr 17 '19 at 03:23
  • @user081608 Okay, I've updated it to return `null` for `roles` when there are no roles. Sorry for all the edits to the answer, I had trouble linking to the correct sql fiddle. – patrick3853 Apr 17 '19 at 03:39
  • @user081608 Glad to help! Databases adding native JSON support is really powerful (MySQL added it in 5.7). Anytime you are manipulating a large data set or performing aggregation, it's usually faster to do it at the database level. – patrick3853 Apr 17 '19 at 04:10

This almost certainly isn't the most efficient possible version but is faster than what you're doing now:

const data = Object.values(arr.reduce((obj, {id, name, roleId, roleTitle}) => {
  if (!(id in obj)) {
    obj[id] = {
      id,
      name,
      roles: {},
    };
  }
  if (!obj[id].roles[roleId]) {
    obj[id].roles[roleId] = {
      id: roleId,
      title: roleTitle,
    };
  }
  return obj;
}, {}));

By using objects (hashes) instead of arrays, determining whether the user is already there, or whether the user already has a role, is an amortized constant-time O(1) operation (the cost of the hash function). Searching an array with `findIndex`, by contrast, is linear, O(n), in the worst case; even a binary search on a sorted array would still be O(log n).

You could go down the rabbit hole of micro-optimizations that will change with the wind, but choosing the correct data structures and algorithms will usually get you the most bang for your optimization buck.

I've used Object.values to convert the outer object back to an array at the end (you'd need to do the same to each `roles` object if you want roles as arrays); if you omit this and just stick with objects it could be even faster.
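The same hash-based grouping can also be written with a `Map`, which keeps insertion order and avoids key coercion to strings. A sketch with made-up sample rows (a plain object serializes to JSON more directly, so treat this as an alternative, not a recommendation):

```javascript
// Map-based sketch of the same O(1) grouping; sample rows are made up.
const rows = [
  { id: 1, name: 'Ann', roleId: 10, roleTitle: 'Admin'  },
  { id: 1, name: 'Ann', roleId: 11, roleTitle: 'Editor' },
  { id: 2, name: 'Bob', roleId: null, roleTitle: null   },
];

const byId = new Map();
for (const { id, name, roleId, roleTitle } of rows) {
  if (!byId.has(id)) byId.set(id, { id, name, roles: [] });
  // Skip the null role a LEFT JOIN produces for role-less users.
  if (roleId != null) byId.get(id).roles.push({ id: roleId, title: roleTitle });
}

const data = [...byId.values()];
console.log(data.length); // 2
```

This version also produces `roles` as arrays directly, matching the shape asked for in the question.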

Jared Smith
  • Thanks Jared, makes sense. I've done this in Java before but didn't know the same concept hold true here – user081608 Apr 17 '19 at 02:29
  • @user081608 yup. Javascript is getting better about datastructures. I'd stick with objects and arrays over Maps and Sets for this since it sounds like you'll probably be JSON serializing it at some point. Hope it works for you! – Jared Smith Apr 17 '19 at 02:31

Hope this helps.

var modified_array = function(xs, key) {
  return xs.reduce(function(rv, x) {
    var obj = (rv[x[key]] = rv[x[key]] || {});
    obj.id = x.id;
    obj.name = x.name;
    obj.roles = obj.roles || [];
    // Guard against users with no roles (roleId is null in the join result)
    if (x.roleId != null) {
      obj.roles.push({ id: x.roleId, title: x.roleTitle });
    }
    return rv;
  }, {});
};

var arr = [{ id: 1, name: "abd", roleId: 10, roleTitle: "hello" },
           { id: 1, name: "abd", roleId: 15, roleTitle: "fello" }];
console.log(Object.values(modified_array(arr, 'id')));
pramod