2

I have a complex json object saved on a postgresql database and I want to find all entries that contain or begin with a substring inside an array. The json object:

    "configurations" : {
  "CVE_data_version" : "4.0",
  "nodes" : [ {
    "operator" : "OR",
    "cpe_match" : [ {
      "vulnerable" : true,
      "cpe23Uri" : "cpe:2.3:a:apache:http_server:*:*:*:*:*:*:*:*",
      "versionStartIncluding" : "2.4.0",
      "versionEndIncluding" : "2.4.41"
    } ]
  } ]
}

More specifically, I want to find all objects that begin with : "cpe:2.3:a:apache" in the "cpe23Uri" field.

Query I have made:

session.query(cvemodel.data['configurations']['nodes'][0]['cpe_match'].contains([{'cpe23Uri': 'cpe:2.3:a:apache:http_server:*:*:*:*:*:*:*:*'}])).all()

Problem with this query is that, it matches the whole word. If I put this:

session.query(cvemodel.data['configurations']['nodes'][0]['cpe_match'].contains([{'cpe23Uri': 'cpe:2.3:a:apache:http_server'}])).first()

It does not return anything!

1 Answers1

1

If you are using PostgreSQL 12 or later, you can use jsonb_path_exists() and friends:

needle = 'cpe:2.3:a:apache:http_server'

session.query(cvemodel).\
    filter(func.jsonb_path_exists(
        cvemodel.data,
        '''$.configurations.nodes[0].cpe_match[*].cpe23Uri ?
           (@ starts with $needle)''',
        json.dumps({"needle": needle}))).\
    all()

If you want to check if cpe23Uri contains the needle, you can use the like_regex predicate, especially if you have a static "needle"; unfortunately like_regex only accepts string literals as right operand. Another option for a "contains" query would be to extract the cpe23Uri using either jsonb_path_query(), or accessors and jsonb_array_elements(), and use traditional LIKE, as described here.

Ilja Everilä
  • 50,538
  • 7
  • 126
  • 127