1

I have a table whose columns are arrays of reals, like this

   my_column
-----------------
{5.7, 1.5, 1.7, ...}
{4.2, 4.1, 2.6, ...}
etc

and I wish to yield an aggregate array of sums where each element is the sum of the corresponding elements, down the rows -- something like this

   my_column_aggregate
--------------------------------------------------------
{sum(5.7,4.2, etc), sum(1.5,4.1,etc), sum(1.7,2.6,etc) ...}

My tables have several such columns and around 10 to 100 unordered rows, with each array having the same count (about 20) of ordered elements. Conceptually, a table is a matrix and of course I could have stored it using some other arrangement but for other reasons it'll probably remain that way.

I need to write a function, my_agg_sum(), say, so I can

SELECT my_agg_sum (my_column) FROM ... WHERE ...

After doing some research, I think I should ultimately install and use R http://www.r-project.org/ and PL/R http://www.joeconway.com/plr/ since I'll probably eventually have more complicated number crunching to do.

In the meantime, however, since my current needs are simple and I'd like to learn some basics of aggregate function writing, I need some help. I thought I could use the built-in function array_agg() to amass an array of arrays -- a matrix -- for which I could write my summation function. (Because I'm very new to PL/pgSQL, I'd use this http://ledgersmbdev.blogspot.ca/2013/04/introduction-to-postgresql-arrays-basic.html as an example guide.) If I can get an intermediate result

{{5.7, 1.5, 1.7, ...}, {4.2, 4.1, 2.6, ...}, etc}

I think I'm OK. The problem is, when I try

SELECT array_agg (my_column)

I get

ERROR:  could not find array type for data type real[]

It seems arrays of arrays are not directly supported, even though 2D arrays are!

I didn't really understand the responses at http://postgresql.1045698.n5.nabble.com/could-not-find-array-type-for-data-type-character-varying-td1879264.html

Maybe I need to create a new type

CREATE TYPE profile AS (a_name REAL[])

and then what? Or maybe I need to do a load of nested unnest()ing or array_to_string()ing...?

I could somehow iterate over each array element

SELECT  sum (my_column[i])

for i, 1 to n, but how?

Cœur
  • 37,241
  • 25
  • 195
  • 267
Martin F
  • 590
  • 7
  • 28
  • possible duplicate of [Initial array in function to aggregate multi-dimensional array](http://stackoverflow.com/questions/9832973/initial-array-in-function-to-aggregate-multi-dimensional-array) – Erwin Brandstetter Sep 20 '13 at 19:13
  • More "prior art" [here](http://stackoverflow.com/questions/6782268/array-agg-for-array-types) and [here](http://stackoverflow.com/questions/12414750/is-there-something-like-a-zip-function-in-postgresql-that-combines-two-arrays). – Erwin Brandstetter Sep 20 '13 at 19:14
  • Thanks, Erwin. All those examples help my understanding. – Martin F Sep 22 '13 at 18:41

1 Answers1

2

You can write a pretty simple CREATE AGGREGATE statement for this; see this similar prior post.

It's not very efficient, so PL/R will be a much better option. Or writing an aggregate in C if you're brave - the PostgreSQL array API in C is pretty ugly.

Community
  • 1
  • 1
Craig Ringer
  • 307,061
  • 76
  • 688
  • 778