0

Update 2: I have resolved the two initial problems. But another one popped out when I made the changes suggested (scroll to "UPDATE").

I use MySQL 5.7. Update: now migrated to MySQL 8.0

I have this table:

+------+
| unit |
+------+
| 1000 |
| 1040 |
| 1555 |
| 1600 |
| 3020 |
| 7000 |
+------+

And I use the following stored procedure to find gaps between the ranges. It is useful, because it has good performance and stops when the size of the gap is the minimum we desired:

CREATE DEFINER=`development`@`%` PROCEDURE `stack_overflow_question`(req_range_start VARCHAR(32), IN req_range_end VARCHAR(32), IN req_quantity INT(11))
BEGIN
    SET @range_start = req_range_start;
    SET @range_end = req_range_end;
    SET @demanded_quantity = req_quantity;  # Implementation for calculation of gap size.
    SET @quantity_aggregated = 0;  # Implementation for calculation of gap size.

    SELECT
        interval_start,
        interval_end,
        @quantity := interval_end - interval_start + 1 AS quantity,
        @quantity_aggregated := @quantity_aggregated + @quantity AS quantity_aggregated
    FROM (
        SELECT
            @interval_start := @rownum := @rownum + 1 AS interval_start,
            @interval_end := IF (@rownum = unit, 0, @rownum := IF ((unit <= @range_end), unit, @range_end + 1)) - 1 AS interval_end
        FROM (
            SELECT @rownum := (SELECT MIN(@range_start - 1) FROM item)
        ) AS a
        JOIN item
        ORDER BY unit
    ) AS z
    WHERE NOT interval_end <= 0 AND interval_start <= @range_end
    HAVING @quantity_aggregated <= @demanded_quantity; # This makes the query stop when we have reached the desired size of gap.
END

We have three scenarios: one successful and the other troublesome.

Successful example:

call inventory.stack_overflow_question('1000', '8000', 100); # Works well, we requested 100 and it stopped at the minimum required amount.

Result:

+----------------+--------------+----------+---------------------+
| interval_start | interval_end | quantity | quantity_aggregated |
+----------------+--------------+----------+---------------------+
|           1001 |         1039 |       39 |                  39 |
|           1041 |         1554 |      514 |                 553 |
+----------------+--------------+----------+---------------------+

Failed example, 1: The last row is not taken into account. FIXED

call inventory.stack_overflow_question('1000', '9000', 10000); # Produces interval_end to be 6999, but should be 9000 because we are requesting 10,000 gap size and should assume 9000 is the end.

Result:

+----------------+--------------+----------+---------------------+
| interval_start | interval_end | quantity | quantity_aggregated |
+----------------+--------------+----------+---------------------+
|           1001 |         1039 |       39 |                  39 |
|           1041 |         1554 |      514 |                 553 |
|           1556 |         1599 |       44 |                 597 |
|           1601 |         3019 |     1419 |                2016 |
|           3021 |         6999 |     3979 |                5995 |
+----------------+--------------+----------+---------------------+

Desired result:

+----------------+--------------+----------+---------------------+
| interval_start | interval_end | quantity | quantity_aggregated |
+----------------+--------------+----------+---------------------+
|           1001 |         1039 |       39 |                  39 |
|           1041 |         1554 |      514 |                 553 |
|           1556 |         1599 |       44 |                 597 |
|           1601 |         3019 |     1419 |                2016 |
|           3021 |         6999 |     3979 |                5995 |
|           7001 |         9000 |     2000 |                7995 | < additional row showing the end of the range.
+----------------+--------------+----------+---------------------+

Failed example: FIXED

call inventory.stack_overflow_question('1000', '1000', 10000); # I don't even know what is happening here. Since we want to find between 1000 and 1000 and there is no gap, it should return nothing.

Result:

+----------------+--------------+----------+---------------------+
| interval_start | interval_end | quantity | quantity_aggregated |
+----------------+--------------+----------+---------------------+
|           1001 |         1000 |        0 |                   0 |
|           1002 |         1000 |       -1 |                  -1 |
|           1002 |         1000 |       -1 |                  -2 |
|           1002 |         1000 |       -1 |                  -3 |
|           1002 |         1000 |       -1 |                  -4 |
+----------------+--------------+----------+---------------------+

FIXED: so I fixed this one adding AND interval_start <= @range_end to the WHERE clause in the SQL script.

So I simply wonder how to make it work properly and overcome the failed scenarios. Any ideas?


UPDATE AFTER APPLYING PROPOSED SOLUTION:

After making the upgrade to MySQL 8.0, applied Anonymous' solution, and it fixed the problem that the last interval will not fill up to the end... but now I have problems because the first interval is not being started from the range_start as supposed.

Is driving me crazy how much time such a ridiculously simple solution that should be easy to write is taking so much time. I beg for help!!

Scenario:

Now we have this table:

+-------------+
| unit        |
+-------------+
| 3079        |
| 3130        |
| 3185        |
+-------------+

And the solution proposed by Anonymous is implemented:

SET @range_start = 3000;
SET @range_end = 6000;

SELECT
    unit + 1 AS interval_start,
    LEAD(unit, 1, @range_end + 1) OVER (ORDER BY unit) - 1 AS interval_end,
    ((LEAD(unit, 1, @range_end + 1) OVER (ORDER BY unit) - 1) - (unit + 1)) + 1 AS quantity
FROM item
WHERE (
    unit <= @range_end
)
ORDER BY unit

Problem: The first interval should count from the range_start, this is: 3000.

+----------------+--------------+----------+
| interval_start | interval_end | quantity |
+----------------+--------------+----------+
|           3080 |         3129 |       50 |
|           3131 |         3184 |       54 |
|           3186 |         6000 |     2815 |
+----------------+--------------+----------+

Expected output:

+----------------+--------------+----------+
| interval_start | interval_end | quantity |
+----------------+--------------+----------+
|           3000 |         3078 |       79 | < I can't manage to recreate this.
|           3080 |         3129 |       50 |
|           3131 |         3184 |       54 |
|           3186 |         6000 |     2815 |
+----------------+--------------+----------+
UserJ
  • 159
  • 3
  • 13
  • As I may have already mentioned, consider handling issues of data display in application code – Strawberry Jun 05 '20 at 09:52
  • @Strawberry of course that is always an option :) and I am currently doing that. However to me is more intuitive to have as much logic as possible encapsulated in the Stored Procedure. – UserJ Jun 05 '20 at 12:12

2 Answers2

1

The Mysql window functions can simplify the job. Interesting functions are LAG or LEAD depending on whether you want previous or next values from your recordset.

Here is an attempt.

Assuming you have a recordset like:

create table test(unit int);
insert test(unit) values (1000);
insert test(unit) values (1040);
insert test(unit) values (1555);
insert test(unit) values (1600);
insert test(unit) values (3020);
insert test(unit) values (7000);

And you want:

+----------------+--------------+----------+---------------------+
| interval_start | interval_end | quantity | quantity_aggregated |
+----------------+--------------+----------+---------------------+
|           1001 |         1039 |       39 |                  39 |
|           1041 |         1554 |      514 |                 553 |
|           1556 |         1599 |       44 |                 597 |
|           1601 |         3019 |     1419 |                2016 |
|           3021 |         6999 |     3979 |                5995 |
|           7001 |         9000 |     2000 |                7995 | additional row showing the end of the range.
+----------------+--------------+----------+---------------------+

You can do this:

SET @max_value = 9000;
SET @runtot:=0; -- running sum

SELECT interval_start, interval_end, quantity,
(@runtot := @runtot + a.quantity) AS quantity_aggregated
FROM (
    SELECT unit+1 AS interval_start,  LEAD(unit, 1, @max_value+1) OVER (ORDER BY unit) -1 AS interval_end,
    ( (LEAD(unit, 1, @max_value+1) OVER (ORDER BY unit) -1) - (unit+1) ) +1 AS quantity
    FROM test
    ORDER BY unit
) a;

Result:

interval_start  interval_end    quantity    quantity_aggregated
          1001          1039          39                     39
          1041          1554         514                    553
          1556          1599          44                    597
          1601          3019        1419                   2016
          3021          6999        3979                   5995
          7001          9000        2000                   7995

This should be fairly easy to integrate in your stored procedure. Note that I have used a subquery to calculate the running sum, based on this post.

@max_value serves to fill up the last record...

Probably you can prettify this query. I am not sure if it is possible to reuse the calculated fields with aliases since the platform is Mysql.

I can't guarantee performance on a big recordset but should not be worse than alternatives.

Not sure about your Mysql version.

Kate
  • 1,809
  • 1
  • 8
  • 7
  • I will consider upgrade to MySQL 8.0 and test it, however I am not sure if upgrading to MySQL 8.0 is a good idea since I have an old third-party application (year 2009) that gave me a little trouble last time on MySQL 8.0. I know this is a different issue but any ideas on this? How typicall are compatibility issues if I upgrade? – UserJ Jun 06 '20 at 05:43
  • I have had to unmark your reply as the answer for now since now I have the opposite problem after implementing your solution: is not starting from range_start. – UserJ Jun 07 '20 at 14:55
0

In older versions of MySQL, you can use:

select @range_start, min(unit) - 1
from items
where unit >= @range_start
having min(unit) > @range_start
union all
select max(unit) + 1, @range_end
from items
where unit <= @range_end
having max(unit) < @range_end
union all
select unit, unit
from items
union all
select unit + 1, next_unit
from (select i.unit, 
             (select i2.unit
              from items i2
              where i2.unit > i.unit
              order by i2.unit desc
              limit 1
             ) as next_unit
      from items i
     ) i
where next_unit > unit + 1 and next_unit < @range_end
order by 1;

Here is a db<>fiddle.

Gordon Linoff
  • 1,242,037
  • 58
  • 646
  • 786
  • Thank you. But this is not the answer to the particular issue, rather to the previous question I published. I changed the scenario and I do not want to include the individual entries I just want the things to display correctly. Also, using UNION ALL has very high cost. Sorry I confused you. – UserJ Jun 05 '20 at 11:42