7

I'm working with MySQL 8.0.21. I need to write a query that works with the JSON column type. Some of the data inside the JSON documents have null values and I want to filter out these null values.

Examples of possible rows, most properties in the JSON document have been removed for simplicity:

jsonColumn
'{"value":96.0}'
'{"value":null}' -- This is the row I am trying to filter out
NULL

Here is what I've tried:

-- Removed columns where jsonColumn was NULL but, NOT columns where jsonColumn->'$.value' was null. 
SELECT * 
FROM <table>
WHERE jsonColumn->'$.value' IS NOT NULL;

-- Note the unquote syntax, ->>. The code above uses ->.
-- Produced the same result as the code above.
SELECT * 
FROM <table>
WHERE jsonColumn->>'$.value' IS NOT NULL;

-- Produced same result as the two above. Not surprised because -> is an alias of JSON_EXTRACT
SELECT * 
FROM <table>
WHERE JSON_EXTRACT(jsonColumn, '$.value') IS NOT NULL;

-- Produced same result as the three above. Not surprised because ->> is an alias of JSON_EXTRACT
SELECT * 
FROM <table>
WHERE JSON_UNQUOTE(JSON_EXTRACT(jsonColumn, '$.value')) IS NOT NULL;

-- Didn't really expect this to work. It didn't work. For some reason it filters out all records from the select.
SELECT *
FROM <table>
WHERE jsonColumn->'$.value' != NULL;

-- Unquote syntax again. Produced the same result as the code above.
SELECT *
FROM <table>
WHERE jsonColumn->>'$.value' != NULL;

-- Didn't expect this to work. Filters out all records from the select.
SELECT *
FROM <table>
WHERE JSON_EXTRACT(jsonColumn, '$.value') != NULL;

-- Didn't expect this to work. Filters out all records from the select.
SELECT *
FROM <table>
WHERE JSON_UNQUOTE(JSON_EXTRACT(jsonColumn, '$.value')) != NULL;

-- I also tried adding a boolean value to one of the JSON documents, '{"test":true}'. These queries did not select the record with this JSON document.
SELECT * 
FROM <table>
WHERE jsonColumn->'$.test' IS TRUE;
SELECT * 
FROM <table>
WHERE jsonColumn->>'$.test' IS TRUE;

A few interesting things I noticed...

Comparing other values worked. For example...

-- This query seems to work fine. It filters out all records except those where jsonColumn.value is 96.
SELECT *
FROM <table>
WHERE jsonColumn->'$.value' = 96;

Another interesting thing I noticed, which was mentioned in the comments for some of the examples above, was some odd behaviour for the null checks. If jsonColumn was null, the null checks would filter out the record even know I was accessing jsonColumn->'$.value'.

Not sure if this is clear, so let me elaborate a little...

-- WHERE jsonColumn->>'$.value' IS NOT NULL
jsonColumn
'{"value":96.0}'
'{"value":null}' -- This is the row I am trying to filter out. It does NOT get filtered out.
NULL -- This row does get filtered out.

According this post, using ->> and JSON_UNQUOTE & JSON_EXTRACT with IS NOT NULL comparisons should have worked. I assume it worked back then.

Honestly feeling like this may be a bug with the IS statement and JSON column type. There is already weird behaviour where it's comparing against the JSON document rather than the JSON document's values.

Regardless, is there any way to accomplish this? Or are the ways I've been trying confirmed to be the correct way and this is just a bug?

Tyler
  • 957
  • 11
  • 27
  • 1
    Apparently this changed sometime before 8.0.13. https://forums.mysql.com/read.php?176,670072,670072 – Barmar Aug 25 '20 at 14:47
  • The workaround is in that forum thread. It's gross. – Barmar Aug 25 '20 at 14:48
  • @Barmar Well that sucks but, this gets interesting... `WHERE virtualPCI->>'$.value' != 'null'` Seems to be filtering both the expected values and unexpected values. Will update OP seem it will be hard to fit cleanly in a comment. – Tyler Aug 25 '20 at 14:58
  • 1
    That thread is almost 2 years old, doesn't oracle care about bugs in their system? – Daniel W. Aug 25 '20 at 15:09
  • With the information from the forum posted by Barmar and my tests with the string comparisons, I think I will post an answer assuming the site lets me. – Tyler Aug 25 '20 at 15:11

3 Answers3

5

Following Barmar's comment...

Apparently this changed sometime before 8.0.13. forums.mysql.com/read.php?176,670072,670072

A workaround in the forum post seems to use JSON_TYPE. Looks like a terrible workaround tbh.

SET @doc = JSON_OBJECT('a', NULL);
SELECT JSON_UNQUOTE(IF(JSON_TYPE(JSON_EXTRACT(@doc,'$.a')) = 'NULL', NULL, JSON_EXTRACT(@doc,'$.a'))) as C1,
JSON_UNQUOTE(JSON_EXTRACT(@doc,'$.b')) as C2;

The forum post says (regarding code posted before the workaround)...

C2 is effectively set as NULL, but C1 is returned as the 4 char 'null' string.

So I started messing around with string comparisons...

// This filtered out NULL jsonColumn but, NOT NULL jsonColumn->'$.value'
SELECT *
FROM <table>
WHERE jsonColumn->'$.value' != 'null';

jsonColumn
'{"value":96.0}'
'{"value":"null"}' -- Not originally apart of my dataset but, this does get filtered out. Which is very interesting...
'{"value":null}' -- This does NOT get filtered out.
NULL -- This row does get filtered out.

// This filtered out both NULL jsonColumn AND NULL jsonColumn->'$.value'
SELECT *
FROM <table>
WHERE jsonColumn->>'$.value' != 'null';

jsonColumn
'{"value":96.0}'
'{"value":"null"}' -- Not originally apart of my dataset but, this does get filtered out.
'{"value":null}' -- This does get filtered out.
NULL -- This row does get filtered out.
Tyler
  • 957
  • 11
  • 27
  • Can't accept this as an answer for 2 days apparently. – Tyler Aug 25 '20 at 15:14
  • 3
    I think it will also filter out `{"value":"null"}`. That may not be a problem in your use case, though. – Barmar Aug 25 '20 at 15:16
  • @Barmar Good catch, probably won't be a problem for me. I'll do some quick tests to have a more complete answer. – Tyler Aug 25 '20 at 15:16
3

Using the function JSON_VALUE (introduced in 8.0.21) it's possible to correctly read null values from columns with JSON data. Based on your original example code

SELECT * 
FROM <table>
WHERE JSON_VALUE(jsonColumn, '$.value') IS NOT NULL;

You can also verify this without tables using a query like

SELECT 
 JSON_VALUE('{"foo": null}', "$.foo"),
 JSON_VALUE('{"foo": 96.0}', "$.foo");

Take care of this note from the documentation

If not specified by a RETURNING clause, the JSON_VALUE() function's return type is VARCHAR(512)

For example if you're extracting a nested JSON object that may exceed 512 characters, then you can specify that like JSON_VALUE(jsonColumn, '$.value' RETURNING JSON), you can of course specify other types such as UNSIGNED, DOUBLE, etc if you are sure of the expected data.

aland
  • 1,824
  • 2
  • 26
  • 43
1

A more easier approach is using LIKE as the operator

SELECT * FROM WHERE jsonColumn->'$.value' LIKE LOWER('NULL');

and LOWER is used to validate if it is null or NULL because in my testing it did differ

modeht
  • 11
  • 1