5

Trying my hand at JSONB datatype first time(discussion continued from (Join tables using a value inside a JSONB column) on advise from @Erwin , starting new thread)

Two tables (obfuscated data and table names):

  1. Discussion table { discussion_id int, contact_id, group_id, discussion_updates jsonb } [has around 600 thousand rows]
  2. Authorization table { user_id varchar , auth_contacts jsonb, auth_groups jsonb } [has around 100 thousand rows]

auth_contacts jsonb data has key value pairs data (as example)

{ "CC1": "rr", "CC2": "ro" }

auth_groups jsonb data has key value pairs data (as example)

{ "GRP1": "rr", "GRP2": "ro" }
  1. First, on inserts in database via Java JDBC. What I am doing is :

    JSONObject authContacts = new JSONObject();
    
    for (each record in data) {
        authContacts.put(contactKey, contactRight);
        authGroups.put(groupKey, groupRight);
    }
    
    String insertSql = "INSERT INTO SSTA_AuthAll(employee_id, auth_contacts, auth_groups) VALUES(?,?::jsonb,?::jsonb)";
    //-- Connect to Db and prepare query
    preparedStatement.setObject(2, authContacts.toJSONString());
    preparedStatement.setObject(3, authGroups.toJSONString());
    
    // INSERT into DB
    

    Now, the toJSONString() takes time (as much as 1 second sometimes - TIME FOR toJSON STRING LOOP:17238ms) which again is inefficient.

    So again is this right way to do it ? Most examples on google directly have a string which they insert.

    If I directly insert a MAP into jsonb column, it expects an HSTORE extension which is what I shouldn't be using if I am going for jsonb?

  2. Now on the next part:

    I need to join contact_id from discussion table with contact_id of auth_contacts json datatype [which is key as shown in example above] and join group_id of auth_groups with group_id of discussion table

    As of now tried join only on contact_id:

    SELECT *
    FROM discussion d 
    JOIN 
        (SELECT user_id, jsonb_object_keys(a.contacts) AS contacts  
         FROM auth_contacts a 
         WHERE user_id = 'XXX') AS c ON (d.contact_id = c.contacts::text)
    ORDER BY d.updated_date DESC
    

    This join for a user who has around 60 thousand authorized contacts takes around 60 ms and consecutive runs lesser - Obfuscated explain plan is as follows:

    "Sort  (cost=4194.02..4198.39 rows=1745 width=301) (actual time=50.791..51.042 rows=5590 loops=1)"
    "  Sort Key: d.updated_date"
    "  Sort Method: quicksort  Memory: 3061kB"
    "  Buffers: shared hit=11601"
    "  ->  Nested Loop  (cost=0.84..4100.06 rows=1745 width=301) (actual time=0.481..44.437 rows=5590 loops=1)"
    "        Buffers: shared hit=11598"
    "        ->  Index Scan using auth_contacts_pkey on auth_contacts a  (cost=0.42..8.93 rows=100 width=888) (actual time=0.437..1.074 rows=1987 loops=1)"
    "              Index Cond: ((user_id)::text = '105037'::text)"
    "              Buffers: shared hit=25"
    "        ->  Index Scan using discussion_contact_id on discussion d  (cost=0.42..40.73 rows=17 width=310) (actual time=0.016..0.020 rows=3 loops=1987)"
    "              Index Cond: ((contact_id)::text = (jsonb_object_keys(a.contacts)))"
    "              Buffers: shared hit=11573"
    "Planning time: 17.866 ms"
    "Execution time: 52.192 ms"
    

My final aim is an additional join in the same query with group_id too. What jsonb_object_keys does is actually create a userid vs authcontacts mapping of each key. So for a user with 60 thousand contacts it will create a view of 60 thousand rows (probably in memory). Now if I include join on auth_groups (which for sample user with 60 thousand contacts would have around 1000 thousand groups which would make the query slower.

So is this the right way to do join on jsonb object and is there a better way to do this?

marc_s
  • 732,580
  • 175
  • 1,330
  • 1,459
Prachi Tripathi
  • 179
  • 1
  • 2
  • 6

0 Answers0