How to summarize time window based on a status in Kusto

Question

I have recently started working with Kusto. I am stuck with a use case where i need to confirm the approach i am taking is right.

I have data in the following format

In the above example, if the status is 1 and if the time frame is equal to 15 seconds then i need to assume it as 1 occurrence.

So in this case 2 occurrence of status. My approach was

if the current and next rows status is equal to 1 then take the time difference and do row_cum_sum and break it if the next(STATUS)!=0.

Even though the approach is giving me correct output, I am assuming the performance can slow down once the size is increased.

I am looking for an alternative approach if any. Also adding the complete scenario to reproduce this with a sample data.

.create-or-alter function with (folder = "Tests", skipvalidation = "true") InsertFakeTrue() {
range LoopTime from ago(365d) to now() step 6s 
| project TIME=LoopTime,STATUS=toint(1)
}

.create-or-alter function with (folder = "Tests", skipvalidation = "true") InsertFakeFalse() {
range LoopTime from ago(365d) to now() step 29s 
| project TIME=LoopTime,STATUS=toint(0)
}


.set-or-append FAKEDATA <| InsertFakeTrue();

.set-or-append FAKEDATA <| InsertFakeFalse();



FAKEDATA
| order by TIME asc
| serialize 
| extend cstatus=STATUS
| extend nstatus=next(STATUS)
| extend WindowRowSum=row_cumsum(iff(nstatus ==1 and cstatus ==1, datetime_diff('second',next(TIME),TIME),0),cstatus !=1)
| extend windowCount=iff(nstatus !=1 or isnull(next(TIME)), iff(WindowRowSum ==15, 1,iff(WindowRowSum >15,(WindowRowSum/15)+((WindowRowSum%15)/15),0)),0 ) 
| summarize  IDLE_COUNT=sum(windowCount)

Your approach is the way to do this kind of calculation in Kusto. It is relatively efficient (considering the logic requires sorting the data) and should work as long as the sorted data can reside in a single machine. — RoyO, Jun 09 '20 at 09:03
@RoyO Thanks. I have one more doubt. If i do union of functions , lets say UNION findTotalFailedEventsById(1),findTotalSuccessById(1) -> will this be called sequentially one after other or parallel call and then union? — Justin Mathew, Jun 09 '20 at 09:46

score 3 · Accepted Answer · answered Jun 09 '20 at 21:37

The approach in the question is the way to achieve such calculations in Kusto and given that the logic requires sorting is also efficient (as long as the sorted data can reside on a single machine).
Regarding union operator - it runs in parallel by default, you can control the concurrency and spread using hints, see: union operator

How to summarize time window based on a status in Kusto

1 Answers1