0

I have a JSONB column in my database which contain nested JSON. I need to query this column but I want to preserve JSON structure and only drop unnecessary key values. Assuming this is the sample record in database:

{
  "id": 1,
  "foo": "bar",
  "something": "anything",
  "something_else": "anything_else",
  "video": [
    {
      "id": 1,
      "type": "slow",
      "width": 700,
      "height": 900
    },
    {
      "id": 2,
      "type": "medium",
      "width": 900,
      "height": 100
    },
    {
      "id": 3,
      "type": "fast",
      "width": 700,
      "height": 900
    }
  ],
  "image": {
    "candidates": [
      {
        "width": 480,
        "height": 600,
        "scans_profile": "e35"
      },
      {
        "width": 480,
        "height": 600,
        "scans_profile": "e35"
      }
    ]
  }
}

I need to get:

{
  "id": 1,
  "foo": "bar",
  "video": [
    {
      "id": 1,
      "width": 700,
      "height": 900
    },
    {
      "id": 2,
      "width": 900,
      "height": 100
    },
    {
      "id": 3,
      "width": 700,
      "height": 900
    }
  ],
  "image": {
    "candidates": [
      {
        "scans_profile": "e35"
      },
      {
        "scans_profile": "e35"
      }
    ]
  }
}

Is there an elegant, simple way to do that? or is there any way at all?

I am using postgres 13.3

IamMashed
  • 1,811
  • 2
  • 21
  • 32
  • 2
    There is no builtin `filter_keys_recursive` function if that's what you're looking for. You'll have to explicitly select those values that you want to keep and build a new JSON object from it using the JSON processing functions. – Bergi Sep 06 '21 at 20:22
  • 1
    Have a look in https://stackoverflow.com/questions/58802312/remove-key-value-pair-from-jsonb-nested-array-in-postgresql#58802637 – Ronaldo Ferreira de Lima Sep 06 '21 at 20:23
  • 2
    If you had stored this not as a JSON,, but in a normal relational fashion in several tables, the task would be simple. – Laurenz Albe Sep 07 '21 at 02:52

1 Answers1

2

if your source data always has the same format, you can just delete the paths you don't require. this will work because you're filtering out specific keys & not doing other transformations. this will fail if the source can have a varying number of keys or variable array lengths not known at the time of writing this query, since you won't be deleting those other unknown paths.

i.e. if the top-level keys are always [id, key, something, something_else, video, image]

where video is always an array of json objects with keys [id, type, width, height] with length 3

and image is an an object with one key candidates which is an array of length 2 of objects with keys [width, height, scans_profile] then you may write the following:

SELECT
my_json_column 
  #- '{something}'
  #- '{something_else}'
  #- '{video,0,type}'
  #- '{video,1,type}'
  #- '{video,2,type}'
  #- '{image,candidates,0,width}'
  #- '{image,candidates,1,width}'
  #- '{image,candidates,0,height}'
  #- '{image,candidates,1,height}'
AS my_new_json_column
FROM my_table

here's a fiddle: https://dbfiddle.uk/?rdbms=postgres_13&fiddle=508e6b864ff811289b4610c4c2380df8

otherwise, your only option is to build up a new jsonb object

SELECT
  jsonb_build_object(
    'id', my_json_column->'id',
    'foo', my_json_column->'foo',
    'video', video.video,
    'image', jsonb_build_object('candidates', image_candidates.image_candidates)
  ) AS my_new_json_column
FROM my_table
LEFT JOIN LATERAL (
  SELECT jsonb_agg(jsonb_build_object(
      'id', y->'id', 
      'width', y->'width', 
      'height', y->'height'
  )) video
  FROM jsonb_path_query(my_json_column, '$.video[*]') v(y)
) video ON true
LEFT JOIN LATERAL (
  SELECT jsonb_agg(jsonb_build_object(
      'scans_profile', y->'scans_profile'
    )) image_candidates
  FROM jsonb_path_query(my_json_column, '$.image.candidates[*]') v(y)
) image_candidates ON true

Haleemur Ali
  • 26,718
  • 5
  • 61
  • 85
  • your first method does not work. Could you please provide sql fiddle example? – IamMashed Sep 07 '21 at 14:33
  • you're right, the delete operation does not work as i expect it to. it might be possible to implement the same strategy using other json operations. i'll take a look at it when i'm free. – Haleemur Ali Sep 07 '21 at 17:15
  • so, there doesn't seem to be a way to delete keys from all documents in an json array in postgresql yet, so `video[*].type` will not match on any of the keys. as such, I'm updating my first suggestion to delete keys such as `{video,0,type}`, `{video,1,type}`, etc. independently. note also the fix in how the path elements are specified – Haleemur Ali Sep 08 '21 at 02:48
  • 1
    hmmm that may work. video list is not constant but always less than 10, so i guess i can just add more lines of `#- '{video,2,type}'` till i reach `#- '{video,10,type}'` – IamMashed Sep 09 '21 at 19:40