0

I have the common situation, but I can't find a common solution...

Imagine I have two tables like authors and corresponding books:

Authors

+----+-----------+
| id | name      |
+----+-----------+
|  1 | O.Connor  |
|  2 | F.Myler   |
+----+-----------+

Books

+----+-----------+-----------+----------------+
| id | author_id | title     | number_of_pages|
+----+-----------+-----------+----------------+
|  1 | 1         |Book 1     |315             |
|  2 | 1         |Book 2     |265             |
+----+-----------+-----------+----------------+

I want to get the data from database in one sql query (I use mySQL but it doesn't matter, I would like to have a database independent solution) with JOIN like:

SELECT A.*, B.title as book_title, B.number_of_pages as book_pages FROM authors A LEFT JOIN books B ON B.author_id=A.id

I would like to avoid multiple queries in a loop because of productivity issues on big databases. And then I need to nest the data from the second table as a subarray to avoid having multiple copy of the same data. In other words I need to transform a flat result from database to a nested multidimensional array. It should give something like:

[
    {
        "id": 3,
        "name": "O.Connor",
        "birth_date": "05.06.1985",
        "from": "England",
        "mother": "",
        "father": "",
        "email": "email@example.com",
        "books": [
            {
                "title": "Some Title",
                "pages": 235
            },
            {
                "title": "Some Title",
                "pages": 267
            },
            {
                "title": "Some Title",
                "pages": 317
            },
            {
                "title": "Some Title",
                "pages": 235
            },
            {
                "title": "Some Title",
                "pages": 298
            }
        ]
    },
    {
        "id": 28,
        "name": "O.Henri",
        "birth_date": "05.06.1300",
        "from": "England",
        "mother": "",
        "father": "",
        "email": "email@example.com",
        "books": [
            {
                "title": "Some Title",
                "pages": 235
            },
            {
                "title": "Some Title",
                "pages": 267
            },
            {
                "title": "Some Title",
                "pages": 317
            },
            {
                "title": "Some Title",
                "pages": 235
            },
            {
                "title": "Some Title",
                "pages": 298
            }
        ]
    }
]

The only related resources I've found so far are:

SQL query with join to get nested array of objects

https://phpdelusions.net/pdo_examples/nested_array

The first one uses either multiple queries or some SQL Server functionality that is not available in MySQL. The second one proposes to fetch each row from the result separately and construct the resulting array on-the-go. Maybe it's not a bad solution, but maybe there are better ones, more productive.

There are some fetch modes in PDO, like FETCH_GROUP etc. But nothing is enough for my task.

I wrote my own function with an idea to create further some DB class and insert this function as a method (so it should be really universal for most cases).

function group_by($array, $criterias=NULL, $group_name='group') {
    if ($criterias == NULL) return $array; // add also a verification of type
    // transform input parameters for array_diff_key()
    $criterias = array_flip($criterias);
    $group = array($group_name=>0);

    $result_array = [];
    $push_index = 0;
    foreach ($array as $row) {
        $sup_array = array_diff_key($row, $criterias);
        $sub_array = array_intersect_key($row, $criterias);
        // add verification of not NULL elements in future
        $isInArray = false;

        foreach ($result_array as $index => $grouped_array) {
            // if there is no difference between arrays then this sub-array is already present in a result array.
            $array_without_groups = array_diff_key($grouped_array, $group);
            if ( empty(array_diff_assoc($array_without_groups, $sup_array)) && (count($array_without_groups)==count($sup_array)) ) {
                $isInArray = true;
                $push_index = $index;
                break;
            }
        }
        if ($isInArray) {
            array_push($result_array[$push_index][$group_name], $sub_array);
        } else {
            $sup_array[$group_name] = array();
            array_push($sup_array[$group_name], $sub_array);
            array_push($result_array, $sup_array);
        }
    }

    return $result_array;
}

And then you call this on the fetchAll result as:

$result = group_by($result, array('book_title', 'book_pages'), 'books');

I don't think it's the best solution. I am a noobie to PHP, so I'm not even sure if array_diff and array_intersect that I use are the efficient way to extract one array from another. I would be thankful for any advice.

  • Use the author ID as _key_ for your result array, then you can easily collect the relevant data under that. (You can use `array_values` on the whole result later, to “reset” those keys to a simple, zero-based index.) – CBroe May 11 '21 at 13:53
  • The task is not hard and needs in accuracy only. Provide a sample - CREATE TABLE for all tables, INSERT INTO with sample data (3-5 rows per table) and desired output (MySQL output, not its representation on PHP). Also specify precise MySQL version. – Akina May 11 '21 at 13:54
  • Are there some online resources that help to create quickly an example? – GZstudio GZstudio May 11 '21 at 14:14
  • This [meta question](https://meta.stackoverflow.com/questions/333952/why-should-i-provide-a-minimal-reproducible-example-for-a-very-simple-sql-query) has some good instructions for that in the answer. – El_Vanja May 11 '21 at 14:18
  • 1
    I think you're worried that a single query will repeat authors' information for each book, because of SQL deals only in rectangular tables of data. I guess you're concerned that all those extra repetitions of the author info are wasteful. **Don't worry about that.** Seriously. Database systems are optimized for it. Your code to format the stuff is fine. Multiple queries are much more wasteful than some redundant information in your result set. – O. Jones May 11 '21 at 14:23
  • @O.Jones , I agree with you. – Supersonic May 25 '21 at 08:56

1 Answers1

5

Here's a solution that works in MySQL 5.7 or later.

SELECT A.id, A.name, A.birth_date, A.`from`, A.mother, A.father, A.email, 
  JSON_ARRAYAGG(JSON_OBJECT('title', B.title, 'pages', B.number_of_pages)) AS books
FROM authors A LEFT JOIN books B ON B.author_id=A.id
GROUP BY A.id;
Bill Karwin
  • 538,548
  • 86
  • 673
  • 828
  • 1
    Thank's a lot! I read about this function. The problem is that the project that I maintain uses MySQL 5.6. And it's not quite simple to pass to 5.7 due to legacy code and lack of time (and money :) Maybe I'm inventing the wheel, but I want to write this sort of function that will do all the work. Would you be so kind to rewise it a little bit and maybe you could propose some improvements. Thank you. – GZstudio GZstudio May 14 '21 at 07:52
  • The workaround if you're still on MySQL 5.6 is what O. Jones commented above. Get the data as-is, and reformat it in application code after you fetch it. – Bill Karwin May 14 '21 at 14:19
  • 1
    If you think it's too time-consuming and expensive to upgrade to MySQL 5.7, just imagine how expensive it would be to write and maintain the spaghetti code in SQL needed to format JSON. Don't do it! – Bill Karwin May 14 '21 at 14:20