2

I have some columns in my Oracle database that contains json and to extract it's data in a query, I use REGEXP_SUBSTR.

In the following example, value is a column in the table DOSSIER that contains json. The regex extract the value of the property client.reference in that json

SELECT REGEXP_SUBSTR(value, '"client"(.*?)"reference":"([^"]+)"', 1, 1, NULL, 2) FROM DOSSIER;

So if the json looks like this :

[...],
"client": {
  "someproperty":"123",
  "someobject": {
    [...]
  },
  "reference":"ABCD",
  "someotherproperty":"456"
},
[...]

The SQL query will return ABDC.

My problem is that some json have multiple instance of "client", for example :

[...],
"contract": {
  "client":"Name of the client",
  "supplier": {
    "reference":"EFGH"
  }
},
[...],
"client": {
  "someproperty":"123",
  "someobject": {
    [...]
  },
  "reference":"ABCD",
  "someotherproperty":"456"
},
[...]

You get the issue, now the SQL query will return EFGH, which is the supplier's reference.

How can I make sure that "reference" is contained in a json object "client" ?

EDIT : I'm on Oracle 11g so I can't use the JSON API and I would like to avoid using third-party package

Natty
  • 497
  • 1
  • 11
  • 23

2 Answers2

3

Assuming you are using Oracle 12c or later then you should NOT use regular expressions and should use Oracle's JSON functions.

If you have the table and data:

CREATE TABLE table_name ( value CLOB CHECK ( value IS JSON ) );

INSERT INTO table_name (
  value
) VALUES (
  '{
  "contract": {
    "client":"Name of the client",
      "supplier": {
        "reference":"EFGH"
    }
  },
  "client": {
    "someproperty":"123",
    "someobject": {},
    "reference":"ABCD",
    "someotherproperty":"456"
  }
}'
);

Then you can use the query:

SELECT JSON_VALUE( value, '$.client.reference' ) AS reference
FROM   table_name;

Which outputs:

REFERENCE
ABCD

db<>fiddle here


If you are using Oracle 11 or earlier then you could use the third-party PLJSON package to parse JSON in PL/SQL. For example, this question.


Or enable Java within the database and then use CREATE JAVA (or the loadjava utility) to add a Java class that can parse JSON to the database and then wrap it in an Oracle function and use that.

MT0
  • 143,790
  • 11
  • 59
  • 117
  • Thank you for this answer, unfortunately I'm using Oracle 11g, so first option isn't possible. Second option could be neat but my client won't install 3rd party package on his server so I'm stuck on this end too. I might use their scripts to find a way of doing what I want though – Natty Jul 06 '21 at 15:12
  • @Natty Maybe you could enable Java in the database and use a Java class to do the parsing? Or you can suggest to the client that they upgrade the database? (Trying to do it with regular expressions or other string functions is not going to work). – MT0 Jul 06 '21 at 17:35
  • The database upgrade is a work in progress on their infrastructure (to be fair, it's been a work in progress for over a year now...), which is why they don't want to use 3rd party for "only a few months". Never heard of java code used in an Oracle function, I'll check how that works for sure ! – Natty Jul 07 '21 at 09:15
  • 1
    @Natty I've previously done a few answers with examples; you can search StackOverflow for [user:1509264 \[oracle\] CREATE JAVA](https://stackoverflow.com/search?q=user%3A1509264+%5Boracle%5D+CREATE+JAVA) which will give a range of answers from the complex example which [unzips archives](https://stackoverflow.com/a/59250531/1509264) to the simple [hello world](https://stackoverflow.com/a/60600349/1509264). – MT0 Jul 07 '21 at 09:25
0

I faced similar issue recently. If "reference" is a property that is only present inside "client" object, this will solve:

SELECT reference FROM (
    SELECT DISTINCT 
        REGEXP_SUBSTR(
            DBMS_LOB.SUBSTR(
                value,
                4000
            ),
            '"reference":"(.+?)"',
            1, 1, 'c', 1) reference
    FROM DOSSIER
) WHERE reference IS NOT null;

You can also try to adapt the regex to your need.

Edit:

In my case, column type is CLOB and that's why I use DBMS_LOB.SUBSTR function there. You can remove this function and pass column directly in REGEXP_SUBSTR.

Ícaro Erasmo
  • 125
  • 1
  • 9