1

I have an InfluxDB measurement which includes a field that holds either 0 or 1.

How do I find the longest unbroken run of a given value?

Imagine that the field represents whether the sun is up or not, and I have a year's worth of data. I would like the query which finds the longest unbroken run of 1's, which would represent the longest day of the year and return me something like "23rd June 5am to 23rd June 9pm". (I'm in the northern hemisphere, and totally made those times up, but hopefully you get the idea.)

dty
  • 18,795
  • 6
  • 56
  • 82

1 Answers1

1

I don't think this can be done with InfluxQL. In many RDBMS, it's possible to do similar operations in a single SQL query using window functions and grouping.

I've experimented a few ways, but as of v1.3 I believe InfluxQL is just not expressive enough for this task. Limitations include:

  1. No window functions (although some functions exhibit similar behaviour, e.g. DIFFERENCE, DERIVATIVE).
  2. time cannot be manipulated like an ordinary tag or field value. For example, it's not possible to take the FIRST(time) of a group of measurements.
  3. Can only GROUP BY time or tag, not by field value (or derived value from a subquery result). Additionally, when grouped by time, only group interval timestamps are returned by selector functions.
  4. Can only ORDER BY time.

The best way to do this is therefore at the application level.


Edit: for the record, the closest I can get is to use ELAPSED to find the longest gap(s) between subsequent 0 values. This might work for you if your data model is a specific shape and data comes in at regular intervals:

SELECT TOP(elapsed, N) AS elapsed FROM (SELECT ELAPSED(field) FROM measurement WHERE field != 1)

Returns e.g. for N = 1:

time  elapsed
----  -------
2000  500

However, there is no guarantee that there is a value of 1 in the gap. Use FIRST to retrieve the first measurement with field == 1 within the gap, or nothing if there are none:

SELECT FIRST(field) FROM measurement WHERE field = 1 AND time < 2000 and time > (2000 - 500)

Returns e.g.:

time  first
----  -----
1000  1

Therefore the longest run of 1 values is from 1000 -> 2000.

peterdn
  • 2,386
  • 1
  • 23
  • 24
  • Thanks @peterdn. I suspect you're right. My "regular" SQL isn't strong enough to do this in a traditional RDBMS, and I know InfluxDB (and time-series databases in general) have far more constraints. – dty Nov 28 '17 at 08:58
  • @dty I've edited with a potential solution that may work for you depending on shape and regularity of data. – peterdn Nov 28 '17 at 11:58