5

A web application can send to a function an array of arrays like

[
    [
        [1,2],
        [3,4]
    ],
    [
        [],
        [4,5,6]
    ]
]

The outer array length is n > 0. The middle arrays are of constant length, 2 in this example. And the inner arrays lengths are n >= 0.

I could string build it like this:

with t(a, b) as (
    values (1, 4), (2, 3), (1, 4), (7, 3), (7, 4)
)
select distinct a, b
from t
where
    (a = any(array[1,2]) or array_length(array[1,2],1) is null)
    and
    (b = any(array[3,4]) or array_length(array[3,4],1) is null)
    or
    (a = any(array[]::int[]) or array_length(array[]::int[],1) is null)
    and
    (b = any(array[4,5,6]) or array_length(array[4,5,6],1) is null)
;
 a | b 
---+---
 7 | 4
 1 | 4
 2 | 3

But I think I can do better like this

with t(a, b) as (
    values (1, 4), (2, 3), (1, 4), (7, 3), (7, 4)
), u as (
    select unnest(a)::text[] as a
    from (values
        (
            array[
                '{"{1,2}", "{3,4}"}',
                '{"{}", "{4,5,6}"}'
            ]::text[]
        )
    ) s(a)
), s as (
    select a[1]::int[] as a1, a[2]::int[] as a2
    from u
)
select distinct a, b
from
    t
    inner join
    s on
        (a = any(a1) or array_length(a1, 1) is null)
        and
        (b = any(a2) or array_length(a2, 1) is null)
;
 a | b 
---+---
 7 | 4
 2 | 3
 1 | 4

Notice that a text array was passed and then "casted" inside the function. That was necessary as Postgresql can only deal with arrays of matched dimensions and the passed inner arrays can vary in dimension. I could "fix" them before passing by adding some special value like zero to make them all the same length of the longest one but I think it is cleaner to deal with that inside the function.

Am I missing something? Is it the best approach?

Clodoaldo Neto
  • 118,695
  • 26
  • 233
  • 260
  • Is the format of the 3-dimensional array wrapped in text format given or can you chose how to pass these values? And the function receiving these values is plpgsql? Postgres version? – Erwin Brandstetter Oct 09 '13 at 23:29
  • @Erwin I can chose how to pass the values. The front end is Python so very flexible. That format is just how I managed it to work. The function is now working in sql for simplicity but it can be plpgsql or plpython. Postgresql 9.3. – Clodoaldo Neto Oct 10 '13 at 00:05

1 Answers1

2

I like your second approach.

SELECT DISTINCT t.*
FROM   (VALUES (1, 4), (5, 1), (2, 3), (1, 4), (7, 3), (7, 4)) AS t(a, b)
JOIN   (
   SELECT arr[1]::int[] AS a1
         ,arr[2]::int[] AS b1
   FROM   (
      SELECT unnest(ARRAY['{"{1,2}", "{3,4}"}'
                         ,'{"{}"   , "{4,5,6}"}'
                         ,'{"{5}"  , "{}"}'    -- added element to 1st dimension
                         ])::text[] AS arr     -- 1d text array
      ) sub
   ) s ON (a = ANY(a1) OR a1 = '{}')
      AND (b = ANY(b1) OR b1 = '{}')
;

Suggesting only minor improvements:

  1. Subqueries instead of CTEs for slightly better performance.

  2. Simplified test for empty array: checking against literal '{}' instead of function call.

  3. One less subquery level for unwrapping the array.

Result:

a | b
--+---
2 | 3
7 | 4
1 | 4
5 | 1

For the casual reader: Wrapping the multi-dimensional array of integer is necessary, since Postgres demands that (quoting error message):

multidimensional arrays must have array expressions with matching dimensions

An alternate route would be with a 2-dimensional text array and unnest it using generate_subscripts():

WITH a(arr) AS (SELECT '{{"{1,2}", "{3,4}"}
                        ,{"{}", "{4,5,6}"}
                        ,{"{5}", "{}"}}'::text[]   -- 2d text array
             )
SELECT DISTINCT t.*
FROM  (VALUES (1, 4), (5, 1), (2, 3), (1, 4), (7, 3), (7, 4)) AS t(a, b)
JOIN  (
   SELECT arr[i][1]::int[] AS a1
         ,arr[i][2]::int[] AS b1
   FROM   a, generate_subscripts(a.arr, 1) i       -- using implicit LATERAL
   ) s ON (t.a = ANY(s.a1) OR s.a1 = '{}')
      AND (t.b = ANY(s.b1) OR s.b1 = '{}');

Might be faster, can you test?

In versions before 9.3 one would use an explicit CROSS JOIN instead of lateral cross joining.

Clodoaldo Neto
  • 118,695
  • 26
  • 233
  • 260
Erwin Brandstetter
  • 605,456
  • 145
  • 1,078
  • 1,228