1

Teradata: I have Start_Timestamp and End_Timestamp and I need to find the average Avg_Duration for every ID.

Dataset looks like this:

--------------------------------------------------------------------
|ID     | Start_Timestamp             |End_Timestamp               |
--------------------------------------------------------------------
|111    |2021-08-25 19:37:51.327000   |2021-08-26 16:25:51.129600  |
--------------------------------------------------------------------
|111    |2021-07-16 06:17:23.124000   |2021-07-19 13:16:53.185350  |
--------------------------------------------------------------------
|111    |2021-06-22 10:11:21.754400   |2021-02-25 18:48:13.614650  |
--------------------------------------------------------------------

I ran the following code:

SELECT ID, AVG((End_Timestamp - Start_Timestamp) DAY ) AS Avg_Duration
FROM Table_A
Group By ID;

The result looks like this:

-----------------------
|ID     | Aug_Duration|
-----------------------
|111    |            2|
-----------------------

I expected Avg_Duration to be (1+3+3)/3=2.33.

I am aware that integer division gets truncated. So, I CAST the Timestamp difference before I took average to convert Integer into Decimal. My code for that is:

SELECT ID, AVG(CAST((End_Timestamp - Start_Timestamp) DAY AS DECIMAL(2,2))) AS Avg_Duration
FROM Table_A
Group By ID;

I expected 2.33, but, now TeraData window is unresponsive/idle. It neither runs nor I get any error message.

Can someone tell me where am I making a mistake and how can I get Avg_Duration with 2 decimal places and not a truncated one?

cph_sto
  • 7,189
  • 12
  • 42
  • 78
  • 1
    `decimal(2.2)` is the wrong data type, both digits are fractional. But this should result in an *overflow* error message. Are you sure it actually executes, can you find it in the history? `AVG(CAST((End_Timestamp - Start_Timestamp DAY(4)) AS DECIMAL(6,2)))` should work – dnoeth Feb 28 '22 at 12:39
  • Sorry, das war ein Tippfehler. – cph_sto Feb 28 '22 at 13:10
  • Many thanks for your answer. It seems to work. I have a question - Why we do `DAY (4)`? What does 4 mean here? (`DAY AS DECIMAL(6,2)` idea I got from your old answer https://stackoverflow.com/questions/38734036/teradata-keeps-returning-zeros-for-decimals – cph_sto Feb 28 '22 at 13:18
  • 1
    The default precision for DAY is 2, switching to the maximum prevents overflow error (of course, you still get one if it's over 9999 days) – dnoeth Feb 28 '22 at 16:30
  • 1
    Btw, you can also get your expected result using `Avg(Cast(End_Timestamp AS DATE) - Cast(Start_Timestamp AS DATE))`, which will not run in any overflow – dnoeth Mar 01 '22 at 07:36
  • Thank you so much. This snippet of code works too. Now I understand that simple subtracting was not working because it was a timestamp and not a date. Could you kindly, put these two snippetrs of code in Answer, so that I can accept them? – cph_sto Mar 02 '22 at 07:27

1 Answers1

1

You had a ) in the wrong place. But I doubt it caused an issue on Teradata, probably you just didn't notice that the client (Studio?) simply didn't submit it as it was invalid.

This works as expected - after changing DEC(2,2) (two significant digits) to DEC(6,2)(6 significant digits) to avoid an overflow error:

AVG(CAST((End_Timestamp - Start_Timestamp DAY(4)) AS DECIMAL(6,2)))

Even with the maximum DAY(4) you might still get an Interval Overflow error, but your logic can also be calculated using dates:

Avg(Cast(End_Timestamp AS DATE) - Cast(Start_Timestamp AS DATE))

Finally, as it's based on Timestamps you might want more detailed Intervals like

Avg(End_Timestamp - Start_Timestamp DAY(4) TO SECOND(0))
Avg(End_Timestamp - Start_Timestamp DAY(4) TO MINUTE)
dnoeth
  • 59,503
  • 4
  • 39
  • 56