3

I have a MySQL table with the following definition:

mysql> desc person;
+--------+---------+------+-----+---------+-------+
| Field  | Type    | Null | Key | Default | Extra |
+--------+---------+------+-----+---------+-------+
| id     | int(11) | NO   | PRI | NULL    |       |
| name   | text    | YES  |     | NULL    |       |
| fruits | json    | YES  |     | NULL    |       |
+--------+---------+------+-----+---------+-------+

The table has some sample data as follows:

mysql> select * from person;
+----+------+----------------------------------+
| id | name | fruits                           |
+----+------+----------------------------------+
|  1 | Tom  | ["apple", "orange"]              |
|  2 | John | ["apple", "mango"]               |
|  3 | Tony | ["apple", "mango", "strawberry"] |
+----+------+----------------------------------+

How can I calculate the total number of occurrences for each fruit? For example:

+------------+-------+
| fruit      | count |    
+------------+-------+
| apple      | 3     |
| orange     | 1     |
| mango      | 2     | 
| strawberry | 1     |
+------------+-------+

Some research shows that the JSON_LENGTH function can be used but I cannot find an example similar to my scenario.

Arjun
  • 817
  • 3
  • 16
  • 28

3 Answers3

8

You can use JSON_EXTRACT() function to extract each value ("apple", "mango", "strawberry" and "orange") of all three components of the arrays, and then then apply UNION ALL to combine all such queries:

SELECT comp, count(*)
FROM
(
 SELECT JSON_EXTRACT(fruit, '$[0]') as comp FROM person UNION ALL
 SELECT JSON_EXTRACT(fruit, '$[1]') as comp FROM person UNION ALL
 SELECT JSON_EXTRACT(fruit, '$[2]') as comp FROM person 
) q
WHERE comp is not null
GROUP BY comp

Indeed If your DB's version is 8, then you can also use JSON_TABLE() function :

SELECT j.fruit, count(*)
  FROM person p
  JOIN JSON_TABLE(
                 p.fruits,
                '$[*]' columns (fruit varchar(50) path '$')
       ) j
GROUP BY j.fruit;

Demo

Barbaros Özhan
  • 59,113
  • 10
  • 31
  • 55
  • Thank. I have a question - how can I use the `JSON_EXTRACT` function if the maximum number of row elements are unknown? – Arjun Dec 11 '19 at 20:33
  • @Arjun you're welcome. Indeed I advised JSON_TABLE because of this problem :) I don't know and don't think whether a dynamic method exists for JSON_EXTRACT. But luckily you have version 8. – Barbaros Özhan Dec 11 '19 at 20:35
1

You can't do it without first creating a table with one row per fruit.

CREATE TABLE allfruits (fruit VARCHAR(10) PRIMARY KEY);
INSERT INTO allfruits VALUES ('apple'), ('orange'), ('mango'), ('strawberry');

There is not a good way to generate this from the JSON.

Once you have that table, you can join it to the JSON and then use GROUP BY to count the occurrences.

SELECT fruit, COUNT(*) AS count
FROM allfruits
JOIN person ON JSON_SEARCH(person.fruits, 'one', fruit) IS NOT NULL
GROUP BY fruit;

Output:

+------------+-------+
| fruit      | count |
+------------+-------+
| apple      |     3 |
| mango      |     2 |
| orange     |     1 |
| strawberry |     1 |
+------------+-------+

Note that it will do a table-scan on the person table to find each fruit. This is pretty inefficient, and as your person table gets larger, it will become a performance problem.

If you want to optimize for this type of query, then you shouldn't use JSON to store an array of fruits. You should store data in a normalized way, representing the many-to-many relationship between persons and fruits with another table.

This is related to my answer to Is storing a delimited list in a database column really that bad?

Bill Karwin
  • 538,548
  • 86
  • 673
  • 828
  • Thanks for your answer. I understand your point about normalization and avoiding JSON for such a use-case. However, I am working with data where the majority can be well-fed into a relational database but there are some crucial fields in JSON. – Arjun Dec 11 '19 at 20:36
1

I think the simplest solution would be to use JSON_TABLE function.

The query you need is


select ft.fruit, count(ft.fruit) from person,
json_table(
  fruits,
  '$[*]' columns(
     fruit varchar(128) path '$'
    ) 
  ) as ft
  group by ft.fruit
  ;

You can find working example in this dbfiddle Fruit demo

Alexey
  • 2,388
  • 1
  • 16
  • 32