0

I am struggling with a logical puzzle while working with joins. So I have 2 tables shown below EmpStatus and EmpEIN. So what I want to do is join these 2 tables and produce the output like below. If a consumer had an EmpNumber for a complete month than the should get a status in a different row, else the statuses should go with a null entry in the EmpNumber.

I am not sure how I can join these 2 tables to produce the output. If some sql pro can help me out that will be awesome.

 EmpStatus:
 ConsumerID JanStatus   FebStatus   MarStatus   AprStatus   MayStatus
1001            P           P           P           P           P
1002            P           P           P           P           P
1003            P           P           P           P           P
1004            P           P           P           P           P
1005            P           P           P           P           P
EmpEIN:
ConsumerID  EmpNumber   EmpNumberStartDate  EmpNumberEndDate
1001        102020202       1/1/2015            31/1/2015   --  dates in dd/mm/yyyy format
1001        210201021       1/2/2015            31/3/2015
1002            NULL        NULL                NULL
1003            NULL        NULL                NULL
1004            NULL        NULL                NULL
1005            NULL        NULL                NULL

OUTPUT:
ConsumerID  EmpNumber   JanStaus    FebStatus   MarStatus   AprStatus   MayStatus
1001        102020202       P       NULL        NULL        NULL        NULL
1001        210201021       NULL    P           P           NULL        NULL
1001        NULL            NULL    NULL        NULL        P           P
1002        NULL            P       P           P           P           P
1003        NULL            P       P           P           P           P
1004        NULL            P       P           P           P           P
1005        NULL            P       P           P           P           P
imba22
  • 651
  • 1
  • 13
  • 25
  • The logic for "p" doesn't seem to make any sense - plus there's no date consistency in your end date column. Please provide some clarification and what you have actually tried. – MageeWorld Jul 24 '16 at 23:07
  • @MageeWorld For the sake of this problem, you can assum P to be a constant. Its basically a status. Employee 1001 had EmpNumber 102020202 assigned to it for 1/1/2015 to 31/1/2015 duration so it should have the status P only for columns JanStatus against the row for that EmpNumber in the output. – imba22 Jul 24 '16 at 23:48
  • You have columns for JanStatus etc., but the dates in EmpEIN are for (possibly different) years. Did you mean Jan2015Status, etc.? Then, the number (and possibly names) of the columns in EmpStatus and OUTPUT may vary - or do you have a fixed window, like "Jan 2015 through Dec 2015"? If it's not fixed, you will need dynamic SQL - not a great solution in most cases. –  Jul 25 '16 at 00:30
  • Also, is it assumed that "StartDate to EndDate" is always a period that spans full calendar months? So we can't have an Emp who had the account, for example, from 1/4/2015 through 22/5/2015? (I fixed your input table and stated explicitly that dates are dd/mm/yyyy). –  Jul 25 '16 at 00:35
  • Yes the dates in EmpEIN can be different than. The names can be anything as log as I am getting 3 rows for emp 1001. So let me rephrase the problem, how can I just join the 2 tables to generate the output to get 3 rows for the customer 1001. You can forget about the date manipulation for now. – imba22 Jul 25 '16 at 00:58

3 Answers3

0

I have addressed the hard part of the solution which is to break up the date range into whole months ( and anything that spills over either side ) and check if any of the spans dont cover the full months. Then I flag the status for each month in the MAX(CASE ... WHEN ... END ) stmt and use the fixed status of 'P' or NULL. So in effect I have completely ignored the EmpStatus table. Joining that to the solution below is something you need to do and also some more Testing .

-- create test data . This will be your EmpEIN table
create table DateTable ( ConsumerID  int, EMP_NUM int, from_date date , to_date date)
insert into DateTable VALUES ( 1001, 1, '2/1/2015', '3/31/2015' )

CREATE FUNCTION [dbo].[DaysInMonth](@date datetime)
RETURNS int
AS
BEGIN
    SET @date = DATEADD(MONTH, 1, @date)
    DECLARE @result int = (select DAY(DATEADD(DAY, -DAY(@date), @date)))
    RETURN @result
END
Go


CREATE FUNCTION SplitDates
(
    @from_date Datetime ,
    @to_date   Datetime 
)
        RETURNS 
        @Table_Var TABLE 
        (
            mo_from_date  Datetime, 
            mo_to_date    Datetime,
            Days_In_Month int
        )
AS
BEGIN

;WITH cte_SplitDates AS
        (
        SELECT  @from_date as from_date
              , @to_date   as to_date
              , @from_date AS mo_from_date
              , DATEADD(day, day(@from_date)* -1 + 1, @from_date) AS bom_date
        --   FROM DateTable
        UNION ALL
        SELECT from_date
             , to_date
             , DATEADD(month,1,bom_date)
             , DATEADD(month,1,bom_date)
          FROM cte_SplitDates
         where DATEADD(month,1,mo_from_date) < to_date
        )

    INSERT INTO @Table_Var
    SELECT 
      mo_from_date ,
      CASE when to_date < DATEADD(month,1,bom_date) 
       THEN
           to_date
       ELSE
           DATEADD(day, -1, DATEADD(month,1,bom_date))
       END AS mo_to_date ,
      [dbo].[DaysInMonth](mo_from_date) as Days_In_Month
   FROM cte_SplitDates

  RETURN 

END 

Now the Most important part the Query that uses the above helper functions :

SELECT A.ConsumerID,  A.EMP_NUM , 
        MAX( CASE WHEN A.Mnth =1 AND Days_In_Month = Diff_Start_End_Dates THEN 'P' ELSE NULL  END ) as JanStaus    ,
        MAX( CASE WHEN A.Mnth =2 AND Days_In_Month = Diff_Start_End_Dates THEN 'P' ELSE NULL  END ) as Febtaus     ,
        MAX( CASE WHEN A.Mnth =3 AND Days_In_Month = Diff_Start_End_Dates THEN 'P' ELSE NULL  END ) as MarStaus     ,
        MAX( CASE WHEN A.Mnth =4 AND Days_In_Month = Diff_Start_End_Dates THEN 'P' ELSE NULL  END ) as AprStaus     ,
        MAX( CASE WHEN A.Mnth =5 AND Days_In_Month = Diff_Start_End_Dates THEN 'P' ELSE NULL  END ) as MayStaus     
        -- Other Months can be added just as above 
FROM
(
        SELECT D.ConsumerID, D.EMP_NUM , month( S.mo_from_date) as Mnth,
         S.mo_from_date , S.mo_to_date,  S.Days_In_Month, DATEDIFF( Day, S.mo_from_date , S.mo_to_date) + 1 As Diff_Start_End_Dates 
        FROM  -- DateTable D
        (
            SELECT * FROM DateTable D
            UNION ALL
            Select ConsumerID , NULL As EMP_NUM , DATEADD(D, 1, MAX(to_date) ) , STR(YEAR(MAX(to_date))) + '-12-31' -- This is how the NULL EmpNumber is added.
            FROM DateTable D 
            Group by ConsumerID 
        ) D
        CROSS APPLY dbo.SplitDates(D.from_date, D.to_date) S
) A
Group by A.ConsumerID,  A.EMP_NUM

Explanation: The query uses two helper functions

  1. DaysInMonth -- Gets No of Days in a month for a given date
  2. SplitDates -- Splits Date ranges into chunks of whole months. If date range does not span a whole month ( 7/10 - 7/20 ) then it just returns back the same.

I suggest you play with the Query and the helper functions using the sample DateTable. This table has the same schema as your EmpEIN table. (And Iam not using the other status table as its data is pretty much static in nature)

Note: I borrowed the CTE code from here: https://stackoverflow.com/a/20272758/2628302

Community
  • 1
  • 1
objectNotFound
  • 1,683
  • 2
  • 18
  • 25
  • thank you for taking time and coming up with this. My actual problem is how I can join the 2 tables to get the results. In your solution you are working with only one table and than checking (with your 2 function), if they are in the range to assign P. But I still dont know how I can join the EmpStatus and EmpEin tables, to get 3 rows for employee with id 1001,because only than we can maipulate the status. – imba22 Jul 25 '16 at 00:29
  • thanks again. coming back to the joining part, I tried full outer join like this select * from EmpEin a Full outer join EmpStatus b on a.ConsumerID = b.ConsumerID. The problem is that I need a new row for the same empID only when EmpNumber has a value and EmpNumber start and end date do not encompass the complete whole year from 1/1/2015-31/12/2015. I understand that its hard for you to give suggestion without all the pieces, but any suggestion with joining in good – imba22 Jul 25 '16 at 01:03
  • @kushalbhola ohhh so if for any of the consumerID's there are missing months where there is no corresponding data with an EmpNumber then that has to be reflected with a null EmpNumber (which is your 3rd row in this case) .... Right ? – objectNotFound Jul 25 '16 at 01:11
  • Yes that is correct. Look at the output in my first post. EmpID 1001 has 3 rows. It didnt had any EmpNumber for months Apr and May, so that needs to show up with null empNumber and the P status should go only in AprStatus and MayStatus. I always need to report P, no matter if there is any EmpNumber or not. looking forward to your response – imba22 Jul 25 '16 at 01:28
  • @kushalbhola one more question ... how do we determine for which Calendar year we are reporting ? Do we take the Year from EmpNumberStartDate and EmpNumberEndDate columns ? or is it going to be passed from a GUI . Clarify. – objectNotFound Jul 25 '16 at 01:34
  • Its for 2015 always.There is no dynamic craziness happening. The EmNumberStart and end dates can be anything but we are reporitng on 2015 only. – imba22 Jul 25 '16 at 01:42
  • @kushalbhola check the updated query . I modified the query to generate the NULL EmpNumber for the months where there is no start and end date in the EMPEin table. And Since you said it was ok to assume the status as a fixed value (= 'P') I did not have to create the EmpStatus table. – objectNotFound Jul 25 '16 at 01:55
0

Oracle alternative:

WITH base AS
  ( SELECT
      ee.ConsumerID, ee.EmpNumber,
      CASE 
        WHEN EmpNumber IS NULL THEN JanStatus
        WHEN
          EmpNumberStartDate<=           TO_DATE('01-Jan-'||TO_CHAR(EmpNumberStartDate,'YYYY')) AND 
          EmpNumberEndDate  >=ADD_MONTHS(TO_DATE('01-Jan-'||TO_CHAR(EmpNumberStartDate,'YYYY')),1)-1 
          THEN JanStatus 
        ELSE NULL 
      END JanStatus,
      CASE 
        WHEN EmpNumber IS NULL THEN FebStatus
        WHEN
          EmpNumberStartDate<=           TO_DATE('01-Feb-'||TO_CHAR(EmpNumberStartDate,'YYYY')) AND 
          EmpNumberEndDate  >=ADD_MONTHS(TO_DATE('01-Feb-'||TO_CHAR(EmpNumberStartDate,'YYYY')),1)-1 
          THEN FebStatus 
        ELSE NULL 
      END FebStatus,
      CASE 
        WHEN EmpNumber IS NULL THEN MarStatus
        WHEN
          EmpNumberStartDate<=           TO_DATE('01-Mar-'||TO_CHAR(EmpNumberStartDate,'YYYY')) AND 
          EmpNumberEndDate  >=ADD_MONTHS(TO_DATE('01-Mar-'||TO_CHAR(EmpNumberStartDate,'YYYY')),1)-1 
          THEN MarStatus 
        ELSE NULL 
      END MarStatus,
      CASE 
        WHEN EmpNumber IS NULL THEN AprStatus
        WHEN
          EmpNumberStartDate<=           TO_DATE('01-Apr-'||TO_CHAR(EmpNumberStartDate,'YYYY')) AND 
          EmpNumberEndDate  >=ADD_MONTHS(TO_DATE('01-Apr-'||TO_CHAR(EmpNumberStartDate,'YYYY')),1)-1 
          THEN AprStatus 
        ELSE NULL 
      END AprStatus,
      CASE 
        WHEN EmpNumber IS NULL THEN MayStatus
        WHEN
          EmpNumberStartDate<=           TO_DATE('01-May-'||TO_CHAR(EmpNumberStartDate,'YYYY')) AND 
          EmpNumberEndDate  >=ADD_MONTHS(TO_DATE('01-May-'||TO_CHAR(EmpNumberStartDate,'YYYY')),1)-1 
          THEN MayStatus 
        ELSE NULL 
      END MayStatus
    FROM
      EmpStatus es
        JOIN
      EmpEIN ee ON ee.ConsumerID = es.ConsumerID
  )
SELECT
  b.ConsumerID, TO_NUMBER(NULL) EmpNumber, 
  NVL2(b.JanStatus, NULL, es.JanStatus) JanStatus,
  NVL2(b.FebStatus, NULL, es.FebStatus) FebStatus,
  NVL2(b.MarStatus, NULL, es.MarStatus) MarStatus,
  NVL2(b.AprStatus, NULL, es.AprStatus) AprStatus,
  NVL2(b.MayStatus, NULL, es.MayStatus) MayStatus
FROM
  ( SELECT 
      ConsumerID, 
      MAX(JanStatus) JanStatus, 
      MAX(FebStatus) FebStatus, 
      MAX(MarStatus) MarStatus, 
      MAX(AprStatus) AprStatus, 
      MAX(MayStatus) MayStatus 
    FROM base
    WHERE empnumber IS NOT NULL
    GROUP BY ConsumerID 
  ) b
    JOIN
  EmpStatus es ON es.ConsumerID = b.ConsumerID
UNION ALL
SELECT *
FROM base
ORDER BY ConsumerID, EmpNumber NULLS LAST
Unoembre
  • 535
  • 2
  • 9
0

I offer this solution to illustrate the techniques you need to use. The first table is not in normal form; you need to UNPIVOT it before you can do a join. The column names in the first table contain actual data, which is a huge design flaw. You must recover the data when you UNPIVOT - you will see how I do that in the query. (I hard-coded it, you could use date functions applied to string functions applied to column names, but that didn't seem to me like a big improvement... you will have to "undo" this in the Pivoting at the end, and there you can't avoid hardcoding the column names anyway.)

I changed the "P everywhere" to different letters so it's easier to follow what's going on. The joining is by customerid and by date - you will see this in the query.

with
     empstatus ( ConsumerID, JanStatus, FebStatus, MarStatus, AprStatus, MayStatus ) as (
       select 1001, 'A', 'B', 'C', 'D', 'E' from dual union all
       select 1002, 'F', 'G', 'H', 'I', 'J' from dual union all
       select 1003, 'K', 'L', 'M', 'N', 'O' from dual union all
       select 1004, 'P', 'Q', 'R', 'S', 'T' from dual union all
       select 1005, 'U', 'V', 'W', 'X', 'Y' from dual
     ),
     empein ( ConsumerID, EmpNumber, EmpNumberStartDate, EmpNumberEndDate ) as (
       select 1001, 102020202, date '2015-01-01', date '2015-01-31' from dual union all
       select 1001, 210201021, date '2015-02-01', date '2015-03-31' from dual union all
       select 1002,      NULL, NULL             , NULL              from dual union all
       select 1003,      NULL, NULL             , NULL              from dual union all
       select 1004,      NULL, NULL             , NULL              from dual union all
       select 1005,      NULL, NULL             , NULL              from dual
     ),
     a ( consumerid, mth, val ) as (                                           -- UNPIVOT
       select * from empstatus
       unpivot ( val for mth in (                    JanStatus as date '2015-01-01',
                     FebStatus as date '2015-02-01', MarStatus as date '2015-03-01',
                     AprStatus as date '2015-04-01', MayStatus as date '2015-05-01' ) )
     ),
     j ( consumerid, empnumber, mth, val ) as (                                --    JOIN
       select a.consumerid, e.empnumber, a.mth, a.val
       from a left outer join empein e
              on  a.consumerid = e.consumerid
              and a.mth between e.empnumberstartdate and empnumberenddate
     )
select * from j                                                                --   PIVOT
pivot ( min(val) for mth in (                    date '2015-01-01' as JanStatus,
                 date '2015-02-01' as FebStatus, date '2015-03-01' as MarStatus,
                 date '2015-04-01' as AprStatus, date '2015-05-01' as MayStatus ) )     
order by consumerid, empnumber;

Output:

CONSUMERID  EMPNUMBER JANSTATUS FEBSTATUS MARSTATUS APRSTATUS MAYSTATUS
---------- ---------- --------- --------- --------- --------- ---------
      1001  102020202 A
      1001  210201021           B         C
      1001                                          D         E
      1002            F         G         H         I         J
      1003            K         L         M         N         O
      1004            P         Q         R         S         T
      1005            U         V         W         X         Y