2

I'm trying to implement a custom aggregate function in Postgres which will average directions in degrees - i.e. I want to able to do:

SELECT circavg(direction) FROM sometable;

This can be done using the formula:

xbar = atan2(sum(sin(xi), sum(cos(xi)))

I think I need to define an sfunc which will take a direction, and add the sine and cosine of that into two accumulators. The final function then converts the two components back into a direction using atan2.

I can't work out how to define the sfunc so that the current state consists of two components e.g. (float, float). The documentation is a bit short on concrete examples, so any help is appreciated.

Erwin Brandstetter
  • 605,456
  • 145
  • 1,078
  • 1,228
samwise
  • 129
  • 7
  • Welcome to SO! Your described `circavg` is taking single parameter, while you've specified 2 parameters in `xbar` sample. Also, you miss closing parenthesis in your example. – vyegorov Apr 19 '12 at 09:52
  • 1
    @vyegorov: the xbar example does have only a single paramter (`xi`) It's used in two places though. –  Apr 19 '12 at 11:30

2 Answers2

4

You can make use of an ARRAY type internally. Argument type can still be any numeric type. Demonstrating with float (= double precision):

CREATE OR REPLACE FUNCTION f_circavg (float[], float)
  RETURNS float[] LANGUAGE sql STRICT AS
'SELECT ARRAY[$1[1] + sin($2), $1[2] + cos($2), 1]';

CREATE OR REPLACE FUNCTION f_circavg_final (float[])
  RETURNS float  LANGUAGE sql AS
'SELECT CASE WHEN $1[3] > 0 THEN atan2($1[1], $1[2]) END';

CREATE AGGREGATE circavg (float) (
   sfunc     = f_circavg
 , stype     = float[]
 , finalfunc = f_circavg_final
 , initcond  = '{0,0,0}'
);

The transition function f_circavg() is defined STRICT, so it ignores rows with NULL input. It also sets a third array element to identify sets with one or more input rows - else the CASE the final function returns NULL.

Temporary table for testing:

CREATE TEMP TABLE t (x float);
INSERT INTO t VALUES (2), (NULL), (3), (4), (5);

I threw in a NULL value to also test the STRICT magic. Call:

SELECT circavg(x) FROM t;

       circavg
-------------------
 -2.78318530717959

Cross check:

SELECT atan2(sum(sin(x)), sum(cos(x))) FROM t;

       atan2
-------------------
 -2.78318530717959

Returns the same. Seems to work. In test with a bigger table the last expression with regular aggregate functions was 4x faster than the custom aggregate.

Test for zero input rows / only NULL input:

SELECT circavg(x) FROM t WHERE false;     -- no input rows
SELECT circavg(x) FROM t WHERE x IS NULL; -- only NULL input

Returns NULL in both cases.

Erwin Brandstetter
  • 605,456
  • 145
  • 1,078
  • 1,228
  • 1
    It returns `0` if there's no rows. It should return `NULL` instead, – user Feb 20 '19 at 16:47
  • @user: I improved the custom aggregate function to work for no rows and / or NULL value input, too. (My favorite is the faster solution with regular aggregate functions.) – Erwin Brandstetter Feb 21 '19 at 01:52
  • Thanks a lot. I have posted a related question. It'd be great if you can take a look. https://stackoverflow.com/q/54829781/512251 – user Feb 22 '19 at 14:57
0

PostgreSQL provides you with a set of geometry types, POINT being the fundamental one.
Use this type as the input parameter for your function.

You can create your custom type as an alternative if you prefer.

vyegorov
  • 21,787
  • 7
  • 59
  • 73