I don't know how to calculate the average age from the values in a column of type date in SQL Server.

Dale K
    You might calculate age as the difference between now and the column value. Then run the result through the AVG function. Here's how to get age in years: https://stackoverflow.com/questions/10506731/get-difference-in-years-between-two-dates-in-mysql-as-an-integer/23824981 – Honeyboy Wilson Jun 04 '20 at 19:26

2 Answers

You can use datediff() and aggregation. Assuming that your date column is called dt in table mytable, and that you want the average age in years over the whole table, then you would do:

select avg(datediff(year, dt, getdate())) avg_age
from mytable

You can change the first argument to datediff() (called the date part) to any other supported value, depending on what you actually mean by age; for example, datediff(day, dt, getdate()) gives you the difference in days.
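One thing to keep in mind: datediff() returns an int, and avg() over an int expression in SQL Server returns an int as well, so the average gets truncated. Here is a minimal sketch of a fractional average based on days, reusing the assumed names mytable and dt from above. The 365.25 divisor is only an approximation of the length of a year, not an exact age calculation:

```sql
-- Assumes the same hypothetical table (mytable) and date column (dt) as above.
-- Dividing the int day count by the decimal literal 365.25 makes the expression
-- decimal, so avg() keeps the fractional part instead of truncating it.
-- 365.25 days per year is an approximation.
select avg(datediff(day, dt, getdate()) / 365.25) avg_age_years
from mytable
```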

GMB
  • I'm thinking there's going to be an issue with that. If (for example) dt is 2019-12-31 at any time of day and the current date is 2020-01-01 at any time of day, then even if the actual difference between the two is as little as 3ms, that's going to report an age of 1 year, which is grossly incorrect. DATEDIFF is nothing like TimeStampDiff in other languages. DATEDIFF counts only the boundaries it crosses, even if the two values are only 3ms apart. – Jeff Moden Jun 05 '20 at 01:13

First, let's calculate the age in years correctly. See the comments in the code, keeping in mind that DATEDIFF does NOT calculate age. It only counts the number of date part boundaries crossed between the two values.

--===== Local obviously named variables defined and assigned
DECLARE  @StartDT DATETIME = '2019-12-31 23:59:59.997'
        ,@EndDT   DATETIME = '2020-01-01 00:00:00.000'
;
--===== Show the difference in milliseconds between the two date/times
     -- Because of the rounding that DATETIME does on 3.3ms resolution, this will return 4ms,
     -- which certainly does NOT depict an age of 1 year.
 SELECT DATEDIFF(ms,@StartDT,@EndDT)
;
--===== This solution will mistakenly return an age of 1 year for the dates given,
     -- which are only about 4ms apart according to the SELECT above.
 SELECT IncorrectAgeInYears = DATEDIFF(YEAR, @StartDT, @EndDT)
;
--===== This calculates the age in years correctly in T-SQL.
     -- If the anniversary date has not yet occurred, 1 year is subtracted.
 SELECT CorrectAgeInYears = DATEDIFF(yy, @StartDT, @EndDT) 
                          - IIF(DATEADD(yy, DATEDIFF(yy, @StartDT, @EndDT), @StartDT) > @EndDT, 1, 0)
;
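
One edge case worth knowing about, shown here as a sketch with made-up dates: when the start date is February 29, DATEADD(yy, ...) lands on February 28 in a non-leap year, so the formula above treats February 28 as the anniversary in those years. Whether that matches your definition of age for leap-day start dates is something you'll need to decide for yourself. The variable names below are only for this illustration.

```sql
--===== Hypothetical leap-day example; the dates are chosen only for illustration.
DECLARE  @LeapStartDT DATETIME = '20000229'
        ,@DayBefore   DATETIME = '20210227'
        ,@Feb28       DATETIME = '20210228'
;
--===== DATEADD(yy,21,@LeapStartDT) lands on 2021-02-28 because 2021 has no Feb 29.
 SELECT  AgeOnFeb27 = DATEDIFF(yy, @LeapStartDT, @DayBefore)
                    - IIF(DATEADD(yy, DATEDIFF(yy, @LeapStartDT, @DayBefore), @LeapStartDT) > @DayBefore, 1, 0) --Returns 20
        ,AgeOnFeb28 = DATEDIFF(yy, @LeapStartDT, @Feb28)
                    - IIF(DATEADD(yy, DATEDIFF(yy, @LeapStartDT, @Feb28), @LeapStartDT) > @Feb28, 1, 0)         --Returns 21
;
```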

Now, let's turn that correct calculation into an inline Table Valued Function that returns a single value. Used this way, it behaves like a really high speed "Inline Scalar Function".

 CREATE FUNCTION [dbo].[AgeInYears]
        (
        @StartDT DATETIME, --Date of birth or date of manufacture or start date.
        @EndDT   DATETIME  --Usually, GETDATE() or CURRENT_TIMESTAMP but
                           --can be any date source like a column that has an end date.
        )
RETURNS TABLE WITH SCHEMABINDING AS
RETURN
 SELECT AgeInYears = DATEDIFF(yy, @StartDT, @EndDT) 
                   - IIF(DATEADD(yy, DATEDIFF(yy, @StartDT, @EndDT), @StartDT) > @EndDT, 1, 0)
;
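
As a quick sanity check, here's a sketch of calling the function for a single pair of values, using the same two date/times as the demonstration above (written in the language-neutral YYYYMMDD format). Because the function is an inline TVF, you select from it, or CROSS APPLY it, rather than calling it like a scalar function.

```sql
--===== Usage sketch: the two values are about 4ms apart but straddle a year boundary.
     -- DATEDIFF(yy) alone reports 1; the function subtracts 1 because the
     -- anniversary (2020-12-31 23:59:59.997) has not yet occurred, so it returns 0.
 SELECT age.AgeInYears
   FROM dbo.AgeInYears('20191231 23:59:59.997', '20200101') age
;
```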

Then, to Dale's point, let's create a test table and populate it. This one is a little overkill for this problem but it's also useful for a lot of different examples. Don't let the million rows scare you... this runs in just over 2 seconds on my laptop including the Clustered Index creation.

--===== Create and populate a large test table on-the-fly.
     -- "SomeInt" has a range of 1 to 50,000 numbers
     -- "SomeLetters2" has a range of "AA" to "ZZ" 
     -- "SomeDecimal" has a range of 10.00 to 100.00 numbers
     -- "SomeDate" has a range of >=01/01/2000 & <01/01/2020 whole dates
     -- "SomeDateTime" has a range of >=01/01/2000 & <01/01/2020 Date/Times
     -- "SomeRand" contains the value of RAND just to show it can be done without a loop.
     -- "SomeHex9" contains 9 hex digits from NEWID()
     -- "SomeFluff" is a fixed width CHAR column just to give the table a little bulk.
 SELECT TOP 1000000
         SomeInt        = ABS(CHECKSUM(NEWID())%50000) + 1
        ,SomeLetters2   = CHAR(ABS(CHECKSUM(NEWID())%26) + 65)
                        + CHAR(ABS(CHECKSUM(NEWID())%26) + 65)
        ,SomeDecimal    = CAST(RAND(CHECKSUM(NEWID())) * 90 + 10 AS DECIMAL(9,2))
        ,SomeDate       = DATEADD(dd, ABS(CHECKSUM(NEWID())%DATEDIFF(dd,'2000','2020')), '2000')
        ,SomeDateTime   = DATEADD(dd, DATEDIFF(dd,0,'2000'), RAND(CHECKSUM(NEWID())) * DATEDIFF(dd,'2000','2020'))
        ,SomeRand       = RAND(CHECKSUM(NEWID()))  --CHECKSUM produces an INT and is MUCH faster than conversion to VARBINARY.
        ,SomeHex9       = RIGHT(NEWID(),9)
        ,SomeFluff      = CONVERT(CHAR(170),'170 CHARACTERS RESERVED') --Just to add a little bulk to the table.
   INTO dbo.JBMTest
   FROM      sys.all_columns ac1 --Cross Join forms up to 16 million rows
  CROSS JOIN sys.all_columns ac2 --Pseudo Cursor
;
GO
--===== Add a non-unique Clustered Index to SomeDateTime for this demo.
 CREATE CLUSTERED INDEX IXC_Test ON dbo.JBMTest (SomeDateTime ASC)
;

Now, let's find the average age of those million rows, as represented by the SomeDateTime column.

 SELECT  AvgAgeInYears = AVG(age.AgeInYears )
        ,RowsCounted   = COUNT(*)
   FROM dbo.JBMTest tst
  CROSS APPLY dbo.AgeInYears(SomeDateTime,GETDATE()) age
;

Results: [image not available]
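
Note that AgeInYears is an INT, so AVG returns an INT as well and any fractional part of the average is dropped. If you want the decimals, one possible variation is to cast before averaging. The DECIMAL(9,2) type here is just my assumption; pick whatever precision suits you.

```sql
--===== Same query as above, but averaging a DECIMAL so the fraction is kept.
 SELECT  AvgAgeInYears = AVG(CAST(age.AgeInYears AS DECIMAL(9,2)))
        ,RowsCounted   = COUNT(*)
   FROM dbo.JBMTest tst
  CROSS APPLY dbo.AgeInYears(tst.SomeDateTime, GETDATE()) age
;
```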

Jeff Moden