I have a JSON document that I am trying to filter, and then count the number of elements returned. However, the count comes out wrong.

Here is the sample data and code to reproduce the problem.

import json
from jsonpath_ng.ext import parse

data_json = """
{
    "some_key":"some_value",
    "level_1":[
        {
            "level_2_name" : "abc",
            "level_2_attr" : "123"
        },
        {
            "level_2_name" : "def",
            "level_2_attr" : "123"
        },
        {
            "level_2_name" : "ghi",
            "level_2_attr" : "123"
        }
    ]
}
"""

data_dict = json.loads(data_json)
print(data_dict)

path_expr = "$.level_1[? level_2_name == 'abc']['level_2_name'].`len` "
val_lst = parse(path_expr).find(data_dict)
for item in val_lst:
    print(item.value)

This code block prints 3, which is the length of the string "abc", instead of 1, which is the number of elements that match the filter.

Digging deeper, if I remove the `len` at the end, the expression returns a list with only one element, so the filtering is working correctly. Is there a bug in the library, or do I need to tweak the expression?

Further analysis suggests that `len` only works when the matched items are themselves lists (a list of lists), not when they are plain items. In this case the filter returns a list of strings rather than a list of lists of strings.

EDIT 1: For new repliers, I do know that len(list) can be used in Python code, but that is not what I am looking for. I was trying to see whether there is an out-of-the-box solution and whether I had an error in my expression. Apologies if that was not clear earlier.

EDIT 2: I gave up on this approach. I had to parse a complex nested JSON of about 6k lines and extract 400+ values from it in real time for my application. With this library it took ~20 secs; plain Python with a lot of if-else in between did it in < 1 sec. So for the sake of speed I am abandoning this path.
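To make the observation above concrete, here is a plain-Python analogy of what seems to be happening. It is only a sketch of the observed behaviour, not the library's actual code, and it assumes that `len` is applied to each matched value separately rather than to the collection of matches.

```python
# Plain-Python analogy of the behaviour described above (not the library's code).
data = {
    "level_1": [
        {"level_2_name": "abc", "level_2_attr": "123"},
        {"level_2_name": "def", "level_2_attr": "123"},
        {"level_2_name": "ghi", "level_2_attr": "123"},
    ]
}

# The filter step: one matched value per element whose name is "abc".
matches = [
    item["level_2_name"]
    for item in data["level_1"]
    if item.get("level_2_name") == "abc"
]

# Assumption: `len` runs once per match, so a string match yields its length.
per_match_lengths = [len(value) for value in matches]

# The count the question is after is the length of the match list itself.
match_count = len(matches)

print(per_match_lengths)  # [3]
print(match_count)        # 1
```

Under that assumption, 3 is the expected result of the expression, and the count of matches has to come from the list of results rather than from inside the path.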
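For anyone reading later, a minimal sketch of what the plain-Python route from EDIT 2 could look like for the sample data. The helper names `count_where` and `values_where` are hypothetical, made up for this illustration, and this is not the actual application code.

```python
import json

SAMPLE_JSON = """
{
    "some_key": "some_value",
    "level_1": [
        {"level_2_name": "abc", "level_2_attr": "123"},
        {"level_2_name": "def", "level_2_attr": "123"},
        {"level_2_name": "ghi", "level_2_attr": "123"}
    ]
}
"""


def values_where(items, key, expected, field):
    """Return `field` from every dict in `items` whose `key` equals `expected`."""
    return [
        item[field]
        for item in items
        if isinstance(item, dict) and item.get(key) == expected and field in item
    ]


def count_where(items, key, expected):
    """Count the dicts in `items` whose `key` equals `expected`."""
    return sum(
        1 for item in items if isinstance(item, dict) and item.get(key) == expected
    )


doc = json.loads(SAMPLE_JSON)
level_1 = doc.get("level_1", [])

abc_count = count_where(level_1, "level_2_name", "abc")
abc_attrs = values_where(level_1, "level_2_name", "abc", "level_2_attr")

print(abc_count)  # 1
print(abc_attrs)  # ['123']
```

Small helpers like these keep the counting and picking logic in one place, which may reduce the amount of ad hoc if-else code.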

  • Perhaps you want to omit `.len` from your path and then take the length of `val_lst`? – Paulw11 Mar 15 '23 at 06:59
  • @paulw - just as I mentioned to 'yxz' (see the answer below and the discussion there), I wanted something out of the box so that I don't have to do extra processing or logic on my end. – web profiler Mar 16 '23 at 05:17

1 Answer

I tested your code and, as you said, the .'len' returns the length of the string, not of the list. However, if you simply need the count, you could use Python's len() function on the list of results:

import json
from jsonpath_ng.ext import parse

def main():
    data_dict = json.loads(data_json)
    print(data_dict)

    path_expr = "$.level_1[?  level_2_name == 'abc'].level_2_name"
    val_lst = parse(path_expr).find(data_dict)
    for item in val_lst:
        print(item.value)
    print(len(val_lst))

# Changed the data and imports, and moved the code into main(), to ease reading and debugging
data_json = """
{
    "some_key":"some_value",
    "level_1":[
        {
            "level_2_name" : "abc",
            "level_2_attr" : "12"
        },
        {
            "level_2_name" : "def",
            "level_2_attr" : "34"
        },
        {
            "level_2_name" : "abc",
            "level_2_attr" : "45"
        }
    ]
}
"""
main()


Apologies for using ' instead of ` around len above.

yxz
  • Whilst your solution works, my challenge is that I have a JSON with 6k lines, and I am picking some values as is, sometimes I want to count, and sometimes I want to do arithmetic operations. So if I go the custom way, I will end up with a lot of custom code. I want to explore out-of-the-box solutions before going down that path. Plus, this clearly seems like a bug. – web profiler Mar 15 '23 at 01:13
  • @webprofiler I looked online first and couldn't find much documentation on Python's jsonpath implementations, and a couple of the syntaxes given seem not to be supported (they give a parse error for me). As for your concerns about custom functions, I totally agree that it is better to use out-of-the-box solutions. For performance, I tested a bit with the same code and `python -m timeit -s` with about 1790k of formatted json coming out to 400k of `level_2_name`s coming at 10nsecs, so it should not have much performance impact. – yxz Mar 15 '23 at 02:18
  • Thanks for doing that extra check. I will see if I get any other replies, otherwise I will have to run with it. Like you, I first tried the documentation, git repo etc., but it's shockingly bad. That one [page](https://pypi.org/project/jsonpath-ng/) is practically the only one, and I get syntax errors when I try some string operations as well. – web profiler Mar 15 '23 at 02:48
  • @webprofiler Yes, apparently this module is a merge of jsonpath-rw and jsonpath-rw-ext, but the documentation for those two modules is almost equally minimal. – yxz Mar 15 '23 at 03:34