1

In Azure Stream Analytic (IoT Hub), no matter what time window I specify, the TumblingWindow function fails to compare time and won't return the window I need. I am trying to use the below SQL code block to return me a 2-second window, but the output includes all events. As there is no pivot function in Stream Analytic, I am using the method suggested by @Joe-Zhang

In this case, there is an IoT read event every 2 seconds, and I am expecting only one event to be returned -


with tempone as (
    
    select 
    cast(dataArr.ArrayValue.SourceTimestamp as datetime) as SourceTimestamp, 
    cast(valuesArr.ArrayValue.Address as bigint) as Address2, 
    max(cast(valuesArr.ArrayValue.Value as float)) as Value2

from iotinput i
cross apply GetArrayElements(i.Content) as contentArr
cross apply GetArrayElements(contentArr.ArrayValue.Data) as dataArr
cross apply GetArrayElements(dataArr.ArrayValue.[Values]) as valuesArr

WHERE cast(valuesArr.ArrayValue.Address as bigint) = 30002

GROUP BY cast(dataArr.ArrayValue.SourceTimestamp as datetime), 
    cast(valuesArr.ArrayValue.Address as bigint),
    TumblingWindow(second, 2)
),

temptwo AS ( 

    select 
    cast(dataArr.ArrayValue.SourceTimestamp as datetime) as SourceTimestamp, 
    cast(valuesArr.ArrayValue.Address as bigint) as Address3, 
    max(cast(valuesArr.ArrayValue.Value as float)) as Value3


from iotinput i
cross apply GetArrayElements(i.Content) as contentArr
cross apply GetArrayElements(contentArr.ArrayValue.Data) as dataArr
cross apply GetArrayElements(dataArr.ArrayValue.[Values]) as valuesArr

WHERE cast(valuesArr.ArrayValue.Address as bigint) = 30003

GROUP BY cast(dataArr.ArrayValue.SourceTimestamp as datetime), 
    cast(valuesArr.ArrayValue.Address as bigint),
    TumblingWindow(second, 2)
 )

select tempone.SourceTimestamp, tempone.Value2 as Temperature, temptwo.Value3 as Humidity from tempone
join temptwo on tempone.SourceTimestamp = temptwo.SourceTimestamp
and DATEDIFF(second,tempone, temptwo) BETWEEN 0 AND 2

Returned values -

enter image description here

Utkarsh Pal
  • 4,079
  • 1
  • 5
  • 14
Alerteye
  • 51
  • 10

1 Answers1

1

If you don't use TIMESTAMP BY, the time logic will be based on the ingestion time.

Here it looks like you expect time processing to be done on SourceTimestamp, but you don't TIMESTAMP BY on it.

Florian Eiden
  • 832
  • 5
  • 9
  • Hi, I use a tumblingWindow(Duration(hour, 1)) with a timestamp by on a date field of the message in azure event-hub topic and I store min(topic.date) and max(topic.date) in database. In most of cases it works but sometimes stored max(topic.date) - min(topic.date) exceed 1 hour (sometimes even overlap 2 days on hours). Any idea how could that occur ? – Gweltaz Niquel Sep 01 '21 at 07:16
  • Could you please create a question dedicated to that issue? With the query and data examples? Thanks! – Florian Eiden Sep 02 '21 at 21:04
  • 1
    I found the solution : In Azure Stream Analytics web interface in event ordering : set retry policy to drop – Gweltaz Niquel Sep 03 '21 at 09:24
  • 1
    Good catch! The late-event and out-of-order policies will change the timestamp (system property) if you select adjust, but won't change the field in the payload so you can keep you original data. – Florian Eiden Sep 03 '21 at 17:35