1

I'm using MySQL (5.5.65-MariaDB) and have an application which consists of a table, generated_text_tbl. The following is some sample data from this table.

-----------------------------------------------
id | substance_id | display_id | generated_text   
-----------------------------------------------
1  | 54           | 139        | foo
-----------------------------------------------
2  | 54           | 139        | bar
----------------------------------------------
3  | 1933         | 139        | baz
-----------------------------------------------  
4  | 38           | 27         | xyz
-----------------------------------------------

id is an AUTO_INCREMENT field. The substance_id and display_id are foreign keys which refer to records in other tables (substances.id and display.id respectively).

I want to output the text from the generated_text field - separated with a HTML line break character (<br>) - if where matching substance_id and display_id rows exist. If there is no match (i.e. an individual row) then generated_text should still be output, but does not need the <br> character since there is nothing to separate.

I have a solution to this problem which works in PHP but I want a pure SQL solution if possible. I've written out the logic I have since this technically works, albeit in PHP.

For the example data above the correct output would be as follows:

  • foo<br>bar : because there are 2 rows where the substance_id and display_id match (both are 54 and 139 respectively).
  • baz : there is only 1 row WHERE substance_id = 1933 AND display_id = 139.
  • xyz: same logic as above, there is only 1 row WHERE substance_id = 38 AND display_id = 27.

Logically my PHP script works like this:

  1. All unique substance_id's are loaded into an array. Equivalent to SQL SELECT DISTINCT(substance_id) FROM generated_text_tbl.

  2. All unique display_id's are loaded into an array. Equivalent to SQL SELECT DISTINCT(display_id) FROM generated_text_tbl.

  3. Loop through (1) with an inner loop on (2). This effectively cycles through every substance_id and display_id combination, i.e.:

    • substance_id = 54 AND display_id = 139 (2 rows)
    • substance_id = 54 AND display_id = 27 (0 rows)
    • substance_id = 1933 AND display_id = 139 (1 row)
    • substance_id = 1933 AND display_id = 27 (0 rows)
    • substance_id = 38 AND display_id = 139 (0 rows)
    • substance_id = 38 AND display_id = 27 (1 row)
  4. It then obtains an array of generated_text based on the condition above. If the array size is greater than 1 element then a <br> character is added between each. If it's just 1 element then a <br> character isn't required because there is only 1 row of generated_text so nothing needs breaking up.

     // Store the output text
     $output = '';
    
     // e.g. $result contains MySQL rows WHERE substance_id = 54 AND display_id = 139
     if (sizeof($result) == 1) {
         $output = $result['generated_text'];
     } else {
     foreach ($result as $key => $value) {
         $output .= $value['generated_text'] . "<br>";
     }
         $output = (substr($output), 0, -4); // Remove trailing <br>
     }
    
  5. The result of var_dump($output) for the above script is foo<br>bar, which is what I want.

Although this logically works it is inefficient (when there are thousands of rows in the generated_text table). Is there a pure SQL solution to this?

Blag
  • 5,818
  • 2
  • 22
  • 45
Andy
  • 5,142
  • 11
  • 58
  • 131

1 Answers1

1

You're after something like that no ?

SQL Fiddle

Query 1:

SELECT 
  `substance_id` , 
  `display_id`, 
  GROUP_CONCAT( `generated_text` SEPARATOR '<br>' ) as concact_text
from generated_text_tbl 
group by 
  `substance_id` , 
   `display_id`

Results:

| substance_id | display_id | concact_text |
|--------------|------------|--------------|
|           38 |         27 |          xyz |
|           54 |        139 |   foo<br>bar |
|         1933 |        139 |          baz |

Query 2:

SELECT 
  `substance_id` , 
  `display_id`, 
  GROUP_CONCAT( `generated_text` ORDER BY id DESC SEPARATOR '<br>') as concact_reverse_text
from generated_text_tbl 
group by 
  `substance_id` , 
   `display_id`

Results:

| substance_id | display_id | concact_reverse_text |
|--------------|------------|----------------------|
|           38 |         27 |                  xyz |
|           54 |        139 |           bar<br>foo |
|         1933 |        139 |                  baz |

Official Documentation : https://dev.mysql.com/doc/refman/8.0/en/aggregate-functions.html#function_group-concat

Blag
  • 5,818
  • 2
  • 22
  • 45
  • 1
    This works exactly how I need, thanks. Could you add some explanation about how it works? Presumably the `group by` clause is effectively doing the unique `substance_id` and `display_id` combinations that I described in the question? Is this why it doesn't need `distinct()`? – Andy Jun 30 '20 at 14:49
  • 1
    @Andy yes, `distinct` is just a `group by` on every field. Here you ask the DB to make a group by on your two columns and concat with a `
    ` separator if needed the `generated_text`. Usually group by is used for `sum` : you can get the sum for each ID / field you want ;)
    – Blag Jun 30 '20 at 14:55