0

This is a little more complicated than the usual how do I display records from one table which are not in the other.

Rather, each record from one table should appear 'x' number of times in the other.

The table definitions and data are as follows:

CREATE TABLE `time_code` (
  `ID` bigint NOT NULL AUTO_INCREMENT,
  `AREA_ID` bigint NOT NULL,
  `TIME_CODE` varchar(10) NOT NULL,
  `DESCRIPTION` varchar(255) NOT NULL,
  `FISCAL_YEAR` bigint NOT NULL,
  PRIMARY KEY (`ID`)
  )

Id, Area, Timecode, Description, Fiscal Year
1, 51, 101, "Project A", 2022
2, 51, 102, "Project B", 2022
CREATE TABLE `schedule_data` (
  `ID` bigint NOT NULL AUTO_INCREMENT,
  `AREA_ID` bigint NOT NULL,
  `TERRITORY_ID` bigint NOT NULL,
  `GROUP_ID` bigint NOT NULL,
  `TIMECODE_ID` bigint NOT NULL,
  `CALENDAR_MONTH` bigint NOT NULL,
  `FISCAL_YEAR` bigint not null,
  PRIMARY KEY (`ID`)
  )
Id, Area, Territory, Group, Timecode ID, Month, Fiscal Year
1, 51, 52, 53, 1, 2, 2022
2, 51, 52, 54, 1, 2, 2022
3, 51, 52, 55, 1, 2, 2022
4, 51, 52, 53, 2, 2, 2022

In this example, time code 101 appears in each of the groups (53, 54, and 55). Time code 102 appears only in group 53. It is missing from groups 54 and 55.

Each timecode must appear in the schedule_data table for each group (3 times, but it could vary depending upon the number of groups)

How do I write a query that tells me which groups are missing time codes?

Time codes apply to an entire area and the code must exist exactly once for each group...even if the values being tracked are zero.

Edit to add org code table and sample records:

// adjacency list model
CREATE TABLE `organization_map` (
  `ID` bigint NOT NULL AUTO_INCREMENT,
  `ORG_CODE` bigint NOT NULL,
  `NAME` varchar(255) NOT NULL,
  `PARENT_ID` bigint NOT NULL,
  `FISCAL_YEAR` bigint NOT NULL,
  PRIMARY KEY (`ID`)
)
Id, Org Code, Name, Parent Id, Fiscal Year
51, 1, "Area 1", 0, 2022
52, 1, "Territory 1", 51, 2022
53, 5, "Group 1", 52, 2022
54, 6, "Group 2", 52, 2022
55, 7, "Group 3", 52, 2022
56, 2, "Territory 2", 51, 2022
57, 14, "Group 4", 56, 2022
58, 15, "Group 5", 56, 2022
Brian B.
  • 9
  • 3

2 Answers2

0

UPDATE, solution found

Created a view of the schedule data to include the organization info for easier query. Then wrote a query to represent the time codes down to the group level. Next, used a not in sub-query to find the time codes that are not present in the view.

select
    a.id as area_id, t.id as territory_id, g.id as group_id, tc.time_code
from 
    time_code tc
left join
    organization_map a on a.id = 51
left join
    organization_map t on t.id = 52
left join
    organization_map g on g.id in
    (
        58, 59, 60, 61, 62, 63, 64
    )
where
    tc.FISCAL_YEAR = 2022 and
    tc.TIME_CODE not in
    (
        select
            v.time_code
        from
            schedule_data_view v
        where
            v.calendar_month = 10 and
            v.FISCAL_YEAR = 2021 and
            v.AREA_ID = a.id and
            v.TERRITORY_ID and
            v.group_id = g.id
    )
group by
    a.id, t.id, g.id, tc.TIME_CODE;
Brian B.
  • 9
  • 3
0

Basically you can cross join the groups and the codes to get all possible combinations. Then you can use NOT EXISTS and a correlated subquery to find the combinations that are not in the schedule.

SELECT om.id organization_map,
       tc.id time_code
       FROM organization_map AS om
            CROSS JOIN time_code AS tc
       WHERE NOT EXISTS (SELECT *
                                FROM schedule_data AS sd
                                WHERE sd.group_id = om.id
                                      AND sd.timecode_id = tc.id);

I'm not sure about those other columns like fiscal_year, calendar_month etc.. I think that they are just redundant and shouldn't even exist but I don't know what exactly you're modeling here. So they might have their places and be part of the keys. In the latter case, you may need to extend the query using these key columns as well.

sticky bit
  • 36,626
  • 12
  • 31
  • 42