1

I am trying to collect some average survey scores on employees by location over time. These people can transfer locations over time for various reasons. I have 2 tables.

Survey_Scores: this table houses individual survey scores. It contains 3 million rows.

Sample:

Survey_Date Employee_Number Score (1-4)
1/2/2023 1 3
1/5/2023 2 4

Employee_Info: this table houses info about the employees. This info can change over time and this history is kept by having multiple lines for one person in this table. Their rows can be distinguished by the columns record start and record end dates. When the record end date is null, that is their most current information.

Sample showing 2 employees who traded locations on January 4th:

Employee_Number Location Record_Start_Date Record_End_Date
1 SLC 1/1/2023 1/3/2023
2 TEX 1/1/2023 1/3/2023
1 TEX 1/4/2023 (null)
2 SLC 1/4/2023 (null)

PROBLEM: I need to create a view that starts as the Survey_Scores table, but with an added column that denotes the location the employee belonged to as of the Survey_Date.

I tried joining the tables on the employee_Number columns, but got an error in snowflake due to the many to many relationship here. I understand why it isn't working, because it doesn't know which row to look at, so I'm wondering if there is some workaround where I can make it return the location of where that person was on the survey_date.

My initial thought beyond what i have already done is to somehow work in an if statement that says: if the survey_date is between record_start_date and record_end_date, then return location. But I'm not quite sure how to get going on implementing that.

I am still new to SQL and appreciate any help, thanks.

marc_s
  • 732,580
  • 175
  • 1,330
  • 1,459
  • try this: `SELECT a.*, b.location FROM Survey_Scores AS a LEFT JOIN Employee_Info AS b ON a.Employee_Number = b.Employee_Number WHERE a.Survey_Date BETWEEN b.Record_Start_Date AND b.Record_End_Date;` – Manoj Kumar Jun 25 '23 at 04:28
  • This was pretty close, the answer i got from Lukasz Szozda worked and is essentially the same code but with null handling. – Logan Nielsen Jun 25 '23 at 18:49
  • @LoganNielsen There is substantial difference when the condition is on `ON` clause vs. `WHERE` in context of LEFT OUTER join – Lukasz Szozda Jun 25 '23 at 18:52

1 Answers1

0

I tried joining the tables on the employee_Number fields but got an error in snowflake due to the many to many relationship here

SQL allows to join on more than one field/predicate:

SELECT s.*, e.Location
FROM Survey_Scores AS s
LEFT JOIN Employee_Info AS e
  ON s.Employee_Number = e.Employee_Number
 AND s.Survey_Date BETWEEN e.Record_Start_Date 
                       AND COALESCE(e.Record_End_Date, '2999-12-31'::DATE);
Lukasz Szozda
  • 162,964
  • 23
  • 234
  • 275