1

I have a database where I store fishing licenses. The licenses have from- and to-dates, and I'm using a calendar table to count the amount of sold licenses for each day. Each license has a defined licensetype. The licensetype can be valid for one or more fishing zones.

I'm trying to build a query which shows some columns from the zones table and how many licenses have been sold grouped by day and fishing zone, and filtered by licensetype. I've made it work for the days where licenses have been sold. However, the days where no licenses have been sold yet show only null values.

I've been twisting my head around this for 8 hours now. I'm sure there's an easy solution, I just can't see it. SQL Fiddle here.

I think the schema is quite self-explanatory, so I won't include it here. Please see the fiddle if needed.

SET @licensetype = 1,
    @fromdate = '2019-01-01',
    @todate = '2019-01-10';

SELECT zoneID,
   dy,
   seasonmax,
   daymax,
   COUNT(lID) AS sold
FROM
( SELECT z.zoneID,
       z.seasonmax,
       z.daymax,
       l.ID AS lID,
       l.From,
       l.To,
       lt.ValidForZone
FROM zones z
LEFT JOIN licensetypes lt ON z.zoneID IN(lt.ValidForZone)
LEFT JOIN licenses l ON lt.ID = l.TypeID
WHERE FIND_IN_SET( z.zoneID,
                   ( SELECT lt2.ValidForZone
                    FROM licensetypes lt2
                    WHERE lt2.ID = @licensetype ) ) ) derived
RIGHT JOIN calendar ON calendar.dy >= DATE_FORMAT(derived.From, '%Y-%m-%d')
AND calendar.dy < DATE_FORMAT(derived.To, '%Y-%m-%d')
WHERE calendar.dy >= @fromdate
AND calendar.dy <= @todate
GROUP BY dy,
     zoneID
ORDER BY dy
GMB
  • 216,147
  • 25
  • 84
  • 135
aanders77
  • 620
  • 8
  • 22
  • 1
    What should it show instead of null values on those dates? Isn't that the point of using `RIGHT JOIN calendar`, so that it will show all calendar dates, even the ones that don't match anything in the first query? – Barmar Jan 15 '19 at 21:35
  • Do you want to show `0` instead of `NULL`? Use `IFNULL(COUNT(lID), 0) AS sold`. – Barmar Jan 15 '19 at 21:36
  • 1
    But there's no zone ID if nothing has been sold, what do you want to show instead of `NULL`? Use `IFNULL()` to replace the null value with whatever you want instead. – Barmar Jan 15 '19 at 21:38
  • 1
    Hi, unrelated: I dont think you should insert (1, '70,80') into licensetypes, see https://en.wikipedia.org/wiki/First_normal_form – user3813234 Jan 15 '19 at 21:38
  • Yes, you're right about the RIGHT JOIN. I need it to show one row per zone per day, with zonenumber, seasonmax and daymax from the zones table. – aanders77 Jan 15 '19 at 21:39
  • @Barmar it should be possible to show the zoneID, seasonmax and daymax even if there's no sold licenses. Those values are taken from the zones table, which is joined to the licensetype table. I'm using licensetype ID in the WHERE clause, so it should be possible without sold licenses. – aanders77 Jan 15 '19 at 21:41
  • If there's nothing sold on that date, which zone ID should it show for that date? Do you want to show a row for every zone, with a zero count for the ones that weren't sold on that day? – Barmar Jan 15 '19 at 21:50
  • It should show multiple rows per date, one for each zone ID defined in the licensetype table. The count should be zero, and the other columns should have the values from the zones table. (The real tables of course has many more columns than the fiddle. I will use this as a subquery for further calculations). – aanders77 Jan 15 '19 at 21:56

3 Answers3

2

If you want a row for every zone, even the ones that weren't sold on that day, you can take the zones table out of the subquery. You want to start with a CROSS JOIN between the zones and calendar table to get all combinations of zones and dates. Then LEFT JOIN that with the license information to get the license counts.

Also, to join with a comma-separated list you need to use FIND_IN_SET, not IN(). See Search with comma-separated value mysql.

SELECT z.zoneID, c.dy, z.seasonmax, z.daymax, IFNULL(COUNT(l.ID), 0) AS sold
FROM zones AS z
CROSS JOIN calendar AS c
JOIN licensetypes AS lt ON FIND_IN_SET(z.zoneID, lt.ValidForZone)
LEFT JOIN licenses AS l ON lt.ID = l.typeID AND c.dy >= DATE_FORMAT(l.From, '%Y-%m-%d') AND c.dy < DATE_FORMAT(l.To, '%Y-%m-%d')
WHERE c.dy BETWEEN @fromdate AND @todate
    AND lt.ID = @licensetype
GROUP BY z.zoneID, c.dy
ORDER BY dy
Barmar
  • 741,623
  • 53
  • 500
  • 612
  • Thanks, there's so much I have yet to learn! :) However the count in your query isn't correct. On Jan 3, there should be 1 license valid for zone 70 and 2 valid for zone 80. On Jan 4, there should be 2 valid for zone 70 and 3 for zone 80. On Jan 5, there should be 1 valid in zone 70 and 1 in zone 80. – aanders77 Jan 15 '19 at 22:23
  • Sorry, I didn't notice until now that my own query also gives the wrong count :/ – aanders77 Jan 15 '19 at 22:36
  • I think I see the problem. `lt.ID = @licensetype` should only be used when selecting the zones to show, it shouldn't be used when getting the license count. Is that right? I don't have time to update the query now, but it requires joining with `licensetypes` twice: once to get the zones, another to get all the licenses. – Barmar Jan 15 '19 at 22:49
  • Yes, that's right. The licenses can be retrieved by `l.TypeID = @licensetype`. I don't think I'm able to wrap my head around this, so I'm looking forward to your update. – aanders77 Jan 15 '19 at 22:56
  • Forgive me, your query gives the right count. It's me who had a brainfart and forgot that the query filters on licensetype (as it should). – aanders77 Jan 15 '19 at 23:19
1

Thanks for the db fiddle. I have largely modified your query, as per following guidelines :

  • The subquery does not seem necessary : the computation can be achieved in a single, GROUP BY query
  • You need a cartesian product to generate the list of all dates for all zones, hence FROM calendar CROSS JOIN zones ; the general filters coming from the parameters, that apply to calendarand zones, and can be placed in the WHERE clause
  • Every filtering condition on LEFT JOINed tables should be moved to the ON clause of the relevant join
  • it is more efficient to use the DATE function than DATE_FORMAT to trunc a datetime to date

New query (see the db fiddle) :

SELECT z.zoneID, calendar.dy, z.seasonmax, z.daymax, COUNT(DISTINCT l.ID) Asold
FROM 
    calendar
    CROSS JOIN zones z
    LEFT JOIN licensetypes lt ON z.zoneID IN(lt.ValidForZone)
    LEFT JOIN licenses l 
        ON  lt.ID = l.TypeID 
        AND calendar.dy BETWEEN DATE(l.From) AND DATE(l.To)
WHERE
    FIND_IN_SET( z.zoneID, ( SELECT lt2.ValidForZone FROM licensetypes lt2 WHERE lt2.ID = @licensetype ) )
    AND calendar.dy >= @fromdate
    AND calendar.dy <= @todate
GROUP BY z.zoneID, calendar.dy, z.seasonmax, z.daymax
ORDER BY dy

NB : this query does not properly takes in account the case where multiple comma-separated values are stored in licensetypes.ValidForZone : when several licence types match, only the first one is counted (just like your original query). It would be better to normalize your database and use a bridge table to store the 1-N relation between license types and zones, instead of stuffing N values in a single field...

GMB
  • 216,147
  • 25
  • 84
  • 135
  • Thanks, there's so much I have yet to learn! :) However the count in your query isn't correct. On Jan 3, there should be 1 license valid for zone 70 and 2 valid for zone 80. On Jan 4, there should be 2 valid for zone 70 and 3 for zone 80. On Jan 5, there should be 1 valid in zone 70 and 1 in zone 80. – aanders77 Jan 15 '19 at 22:23
  • 1
    @aanders77 : I see your point (your original query had the same issue so I did not pay attention). I am afraid that it might be simpler to fix your database design rather than tweaking the query... I updated my answer. – GMB Jan 15 '19 at 22:35
  • Sorry, I didn't notice until now that my own query also gives the wrong count. – aanders77 Jan 15 '19 at 22:35
1

I think this query should give you the desired results (which seems to be 1 for 3/1, 2 for 4/1 and 5/1). You have to start from a CROSS JOIN of days and zones, then LEFT JOIN to licensetypes and licenses on the relevant license type and days:

SELECT z.zoneID, DATE(c.dy) AS dy, z.seasonmax, z.daymax, COUNT(l.ID) AS sold
FROM calendar c
CROSS JOIN zones z
LEFT JOIN licensetypes t ON t.ID = @licensetype AND FIND_IN_SET(z.zoneID, t.ValidForZone)
LEFT JOIN licenses l ON l.TypeID = t.ID AND DATE(c.dy) >= DATE(l.From) AND DATE(c.dy) < DATE(l.To)
WHERE DATE(c.dy) BETWEEN @fromdate AND @todate
GROUP BY dy, z.zoneID

Demo on SQLFiddle

Nick
  • 138,499
  • 22
  • 57
  • 95
  • Thanks for the answer. However I think your count is wrong? (As I commented on the two other answers, the count is also wrong in my own fiddle). If you check the licenses table, there are 3 licenses. Two of them are valid on Jan 3. No 1 is valid only in zone 80, while the other one is valid in both zones (70 and 80). So the count on Jan 3 should be 1 for zone 70 and 2 for zone 80. Also, the "to"-date should not be included in the count. – aanders77 Jan 15 '19 at 23:02
  • @aanders77 the count for the license only valid in zone 80 is not included because you wanted to filter by @licensetype = 1. Is that not what you want? When you refer to the `to` date, do you mean for the licenses table, the @todate variable, or both? – Nick Jan 15 '19 at 23:05
  • Of course, you are right! It's 10 minutes past midnight here now, and I've been working on this since around 3 this afternoon. Maybe it's time to take a break! About the `to` date, I mean in the licenses table. – aanders77 Jan 15 '19 at 23:09
  • @aanders77 I've updated my answer based on your comments about the `to` date. It still checks on the `@licensetype`, you can simply remove that part of the `ON` clause and I think you'll see it gives the correct results in that case too. – Nick Jan 15 '19 at 23:12