I am trying to calculate age-range based on Customer's date-of-birth. Data for this source comes into staging, where I call this following stored procedure. And a batch size generally is 500K to 2M here.
CREATE PROCEDURE [spCustomer_UpdateStageAgeRange]
AS
BEGIN
DECLARE @Today date = getdate()
update stg.tCustomer_Merge set [CALC_Age_Range] =
CASE
WHEN datediff(Year, CAST([DOB] AS date), @Today) < 13 THEN '<13'
WHEN datediff(Year, CAST([DOB] AS date), @Today) < 18 THEN '13-17'
WHEN datediff(Year, CAST([DOB] AS date), @Today) < 22 THEN '18-21'
WHEN datediff(Year, CAST([DOB] AS date), @Today) < 26 THEN '22-25'
WHEN datediff(Year, CAST([DOB] AS date), @Today) < 34 THEN '26-33'
WHEN datediff(Year, CAST([DOB] AS date), @Today) < 45 THEN '34-44'
WHEN datediff(Year, CAST([DOB] AS date), @Today) >= 45 THEN '45+'
ELSE 'Unknown'
END
End
Problem : Even for some valid DOB, this stored procedure calculates age-range as 'Unknown'. We are not able to re-generate this for smaller data set, so is this a big volume problem? Or any other reason?
Context: In 28M records, 230K are having 'Unknown'
Note: We are using Azure-sql-server. DOB is a date field in table stg.tCustomer_Merge. Manual runs of a DOB works, it's only in large batches we see this issue.