4

I have a table with two columns wind_speed and wind_direction. I want to have a custom aggregrate function that would return average wind_speed and wind_direction. wind_speed and wind_direction in combination defines a vector where wind_speed is the magnitude of the vector and wind_direction is the direction of the vector. avg_wind_direction function should return average wind_speed as magnitude and wind_direction as direction of the average vector.

SELECT avg_wind_direction(wind_speed, wind_direction)
FROM sometable
GROUP BY location;

Related question: Custom PostgreSQL aggregate for circular average

user
  • 5,335
  • 7
  • 47
  • 63
  • 2
    So what exactly is that aggregate expected to calculate? Please **[edit]** your question (by clicking on the [edit] link below it) and add some [sample data](https://meta.stackexchange.com/questions/81852) and the expected output based on that data. [Formatted text](https://meta.stackoverflow.com/a/251362) please, [no screen shots](http://meta.stackoverflow.com/questions/285551/why-may-i-not-upload-images-of-code-on-so-when-asking-a-question/285557#285557). –  Feb 22 '19 at 15:03
  • 1
    Why not use normal `avg` twice? – Laurenz Albe Feb 22 '19 at 15:13
  • Sorry I forgot to mention it is vector averaging because wind speed and wind direction in combination is a vector where speed is the magnitude of the vector and direction is the direction of vector in degrees. Normal averaging would not work. – user Feb 22 '19 at 15:16
  • First you need to define how to calculate the average direction. For example if you have 2 readings, one each of 90 degrees and 270 degrees, then a 'default' calculation of the average will come up with 180 degrees, which might not be what you want. For speed, avg(windspeed) seems sensible. – simon at rcl Feb 22 '19 at 15:47
  • 1
    maybe look at this extension - https://pgxn.org/dist/aggs_for_vecs/ - should calculate average and some other aggregations over vector – JosMac Feb 22 '19 at 15:48
  • 1
    Please start by providing your version of Postgres, a table definition, some sample values and the expected result. And does it have to be a *single* aggregate function? (Why?) Or can you combine multiple aggregate functions? – Erwin Brandstetter Feb 22 '19 at 16:09
  • I've been able to do it. Please see my answer below. Thanks everyone! – user Feb 22 '19 at 18:28

2 Answers2

4

First sorry if i break any posting rules here, first time poster and all that.

Wanted to use the above answer together with the timescaledb addition to postgres for my diy weather station but it turns out that the function is not parallel safe. Also afik the use of atan does not yield the correct answer.

So this is my modified version that i think should be parallel safe and uses atan2 instead.

DROP AGGREGATE IF EXISTS vector_avg(float, float) CASCADE;
DROP TYPE IF EXISTS vector_sum CASCADE;
DROP TYPE IF EXISTS avg_vector CASCADE;

CREATE TYPE vector_sum AS (x float, y float, count int);
CREATE TYPE avg_vector AS (magnitude float, direction float);

CREATE OR REPLACE FUNCTION sum_vector (vectors vector_sum, magnitude float, direction float)
  RETURNS vector_sum LANGUAGE sql PARALLEL SAFE STRICT AS
'SELECT vectors.x + (magnitude * cos(radians(direction))), vectors.y + (magnitude * sin(radians(direction))), vectors.count + 1';

CREATE OR REPLACE FUNCTION combine_sum (part1 vector_sum , part2 vector_sum)
  RETURNS vector_sum LANGUAGE sql PARALLEL SAFE STRICT AS
'SELECT (part1.x+part2.x)/2,(part1.y+part2.y)/2,part1.count+part2.count';

CREATE OR REPLACE FUNCTION avg_vector_finalfunc(vectors vector_sum)
RETURNS avg_vector
AS
$$
DECLARE
        x float;
        y float;
        d float;
BEGIN
    BEGIN
        IF vectors.count = 0 THEN
            RETURN (NULL, NULL)::avg_vector;
        END IF;

        x := (vectors.x/vectors.count);
        y := (vectors.y/vectors.count);

        -- This means the vector is null vector
        -- Please see: https://math.stackexchange.com/a/3682/10842
        IF x = 0 OR y = 0 THEN
            RETURN (0, 0)::avg_vector;
        END IF;

         d:=degrees(atan2(y,x));

        -- atan2 returns negative result for angles > 180

        IF d < 0 THEN
          d := d+360;
        END IF;

        RETURN (sqrt(power(x, 2) + power(y, 2)), d )::avg_vector;
    EXCEPTION WHEN others THEN
        RETURN (NULL, NULL)::avg_vector;
    END;
END;
$$
LANGUAGE 'plpgsql'
PARALLEL SAFE
RETURNS NULL ON NULL INPUT;

CREATE AGGREGATE vector_avg (float, float) (
   sfunc     = sum_vector
 , stype     = vector_sum
 , combinefunc = combine_sum
 , finalfunc = avg_vector_finalfunc
 , initcond  = '(0.0, 0.0, 0)'
 , PARALLEL  = SAFE

Test from a very small sample:

psql -d weather -c "select * from windavgtest;"
             time              | direction | speed 
-------------------------------+-----------+-------
 2019-08-01 16:51:53.199357+00 |       170 |     1
 2019-08-01 16:51:54.388392+00 |       170 |     1
 2019-08-01 16:51:55.335034+00 |       170 |     1
 2019-08-01 16:51:56.362812+00 |       170 |     1
 2019-08-01 16:52:07.191919+00 |       190 |     1
 2019-08-01 16:52:08.250756+00 |       190 |     1
 2019-08-01 16:52:09.193265+00 |       190 |     1
 2019-08-01 16:52:10.224283+00 |       190 |     1
(8 rows)

yields:

psql -d weather -c  "select round((vector_avg(speed, direction)).direction) AS wdirection from windavgtest;
"
 wdirection 
------------
        180
(1 row)
kill
  • 51
  • 2
  • Yes, you are right. There were some bugs. I've fixed those. Please see my updated answer. – user Aug 01 '19 at 17:33
2

So I have been able to create an aggregrate function that does the vector averaging. It makes the assumption that the vector is in polar co-ordinates and the angle is in degrees, as opposed to radian.

DROP AGGREGATE IF EXISTS vector_avg(float, float) CASCADE;
DROP TYPE IF EXISTS vector_sum CASCADE;
DROP TYPE IF EXISTS avg_vector CASCADE;

CREATE TYPE vector_sum AS (x float, y float, count int);
CREATE TYPE avg_vector AS (magnitude float, direction float);

CREATE OR REPLACE FUNCTION sum_vector (vectors vector_sum, magnitude float, direction float)
  RETURNS vector_sum LANGUAGE sql STRICT AS
'SELECT vectors.x + (magnitude * cos(direction * (pi() / 180))), vectors.y + (magnitude * sin(direction  * (pi() / 180))), vectors.count + 1';
CREATE OR REPLACE FUNCTION avg_vector_finalfunc(vectors vector_sum) RETURNS avg_vector AS
$$
DECLARE
        x float;
        y float;
        maybe_neg_angle numeric;
        angle numeric;

        v_state   TEXT;
        v_msg     TEXT;
        v_detail  TEXT;
        v_hint    TEXT;
        v_context TEXT;
BEGIN
    BEGIN
        IF vectors.count = 0 THEN
            RETURN (NULL, NULL)::avg_vector;
        END IF;

        x := (vectors.x/vectors.count); 
        y := (vectors.y/vectors.count);

        -- This means the vector is null vector
        -- Please see: https://math.stackexchange.com/a/3682/10842
        IF x = 0 OR y = 0 THEN
            RAISE NOTICE 'X or Y component is 0. Returning NULL vector';
            RETURN (0.0, 0.0)::avg_vector;
        END IF;

        maybe_neg_angle := atan2(CAST(y AS NUMERIC), CAST(x AS NUMERIC)) * (180.0 / pi());
        angle := MOD(CAST((maybe_neg_angle + 360.0) AS NUMERIC), CAST(360.0 AS NUMERIC));

        RETURN (sqrt(power(x, 2) + power(y, 2)), angle)::avg_vector;

    EXCEPTION WHEN others THEN
        RAISE NOTICE 'Exception was raised. Returning just NULL';
        RETURN (NULL, NULL)::avg_vector;
    END;
END;
$$
LANGUAGE 'plpgsql'
RETURNS NULL ON NULL INPUT;


CREATE AGGREGATE vector_avg (float, float) (
   sfunc     = sum_vector
 , stype     = vector_sum
 , finalfunc = avg_vector_finalfunc
 , initcond  = '(0.0, 0.0, 0)'
);

Test:

DROP TABLE t;
CREATE TEMP TABLE t(speed float, direction float);
INSERT INTO t VALUES (23, 334), (20, 3), (340, 67);

Test:

SELECT (vector_avg(speed, direction)).magnitude AS speed, (vector_avg(speed, direction)).direction AS direction FROM t;

Result:

+-----------------+-------------------+
| speed           | direction         |
+=================+===================+
| 108.44241888507 | 0.972468335643555 |
+-----------------+-------------------+

Removing all the rows:

DELETE FROM t;
SELECT (vector_avg(speed, direction)).magnitude AS speed, (vector_avg(speed, direction)).direction AS direction FROM t;

Result:

+---------+-------------+
| speed   | direction   |
+=========+=============+
| <null>  | <null>      |
+---------+-------------+
user
  • 5,335
  • 7
  • 47
  • 63