2

I have a nested JSON in a field that contains multiple important keys that I would like to retrieve as an array:

{
  "tasks": [
    {
      "id": "task_1",
      "name": "task_1_name",
      "assignees": [
        {
          "id": "assignee_1",
          "name": "assignee_1_name"
        }
      ]
    },
    {
      "id": "task_2",
      "name": "task_2_name",
      "assignees": [
        {
          "id": "assignee_2",
          "name": "assignee_2_name"
        },
        {
          "id": "assignee_3",
          "name": "assignee_3_name"
        }
      ]}]}

All the queries that I've tried so far fx ( $.tasks.*.assignees..id) and many others have returned

[
  "assignee_1",
  "assignee_2",
  "assignee_3"
]

But what I need is:

[
  ["assignee_1"],
  ["assignee_2", "assignee_3"]
]

Is it possible to do with JSONPath or any script inside of it, without involving 3rd party tools?

Aleeb
  • 35
  • 1
  • 5
  • 1
    The basic issue is that JSONPath is a *selection* language. It can select any of the items that are in your JSON, and it will give you a list of matches, such as `["assignee_1", "assignee_2", "assignee_3"]`. The result you want is not a selection, but a transformation - practically mapping "task" objects to arrays (`root.tasks.map(t => t.assignees.map(a => a.id))`. This is not something that JSONPath has been designed to do. – Tomalak Apr 21 '22 at 08:38
  • Similar question over here: https://stackoverflow.com/questions/30221930/jsonpath-to-transform-and-map-to-different-tiers-of-data-with-a-single-output – Tomalak Apr 21 '22 at 08:39
  • That's a pity. Thank you for clarifying things @Tomalak ! – Aleeb Apr 21 '22 at 11:27
  • What environment are you in? – Tomalak Apr 21 '22 at 12:27
  • I'm using Pentaho Data Integrator with Json Input step – Aleeb Apr 21 '22 at 12:34
  • I suppose you could [use a Javascript step](https://intellipaat.com/blog/tutorial/pentaho-tutorial/transforming-your-data-with-javascript-code-and-the-javascript-step/) instead - the code to get the data structure you want is in my first comment. – Tomalak Apr 21 '22 at 12:44
  • I may try to plug it into the stream indeed, I just hoped there's a solution to do it within the Json step using Json path. Thanks for the help! – Aleeb Apr 21 '22 at 12:48

1 Answers1

2

The problem you're facing is that tasks and assignees are arrays. You need to use [*] instead of .* to get the items in the array. So your path should look like

$.tasks[*].assignees[*].id

You can try it at https://json-everything.net/json-path.

NOTE The output from my site will give you both the value and its location within the original document.

Edit

(I didn't read the whole thing :) )

You're not going to be able to get

[
  ["assignee_1"],
  ["assignee_2", "assignee_3"]
]

because, as @Tomalak mentioned, JSON Path is a query language. It's going to remove all structure and return only values.

gregsdennis
  • 7,218
  • 3
  • 38
  • 71