1

I'm attempting to create a query such as

SELECT e.col1, e.col2 
FROM entity e
INNER JOIN (
    VALUES
    (1377776),(1377792),(1377793),(1377794),(1377795),(1377796)
) ex(ex_entityid) ON (entityid = ex_entityid)

Given I have an array containing the numbers 13777xx in an array, how would I generate this query?

I got this syntax from this question: https://stackoverflow.com/a/17824797/425544. My input list is large enough that using IN is too slow.

So far, using $0:csv, I can get my the values to be a comma separated list. I do not know how to format the values so that it has the parenthesis around it. If I give the array without the :csv tag, it adds "array" before the values, and then I got a syntax error.

Nikhil
  • 1,121
  • 2
  • 11
  • 27

2 Answers2

1

Given I have an array containing the numbers 13777xx in an array, how would I generate this query?

Arbitrary numbers

If you have an actual array, just pass the array literal, in the form of:

'{1377776,1377792,1377793}'

For small arrays with just a hand full of values, you can just use the ANY construct in the join:

SELECT col1, col2 
FROM   entity
WHERE  entityid = ANY ($my_array::int[]);

For string literal you'll need an explicit type cast as demonstrated.

For long arrays, it will be more efficient to unnest the array to a set (derived table) and then join like mu commented:

SELECT col1, col2
FROM   unnest(my_array::int[]) ex(entityid)
JOIN   entity USING (entityid);

With a matching column name for the unnested numbers you can conveniently use the short USING clause for the join condition.

Note a subtle difference: The first query implicitely folds duplicates in the input, while the second with the join produces duplicates for duplicates in the input. Your pick.

Alternatively you can also pass individual values to an IN expression. But that's typically slower. (Again: folds duplicates.) Like:

...
WHERE  entityid IN (1377776, 1377792, 1377793);

See:

For repeated operations you might create a VARIADIC function and pass individual numbers as parameters (or a single array literal, at your choosing). See:

Not so arbitrary numbers

If you actually want to base you query on "the numbers 13777xx", meaning all numbers between 1377700 and 1377799, use generate_series() instead. Like:

SELECT col1, col2 
FROM   generate_series(1377700, 1377799) entityid
JOIN   entity USING (entityid);

Or, simplest and fastest, adapt the WHERE clause with a single range predicate:

SELECT col1, col2 
FROM   entity
WHERE  entityid BETWEEN 1377700 AND 1377799;
Community
  • 1
  • 1
Erwin Brandstetter
  • 605,456
  • 145
  • 1,078
  • 1,228
  • I used the unnest. Thanks! – Nikhil Apr 05 '19 at 15:23
  • @Nikhil It is kind of odd that as the author of pg-promise, I gave you the precise answer to your question, but the one you accept as the answer that's a step-away from it? :) – vitaly-t Apr 05 '19 at 15:31
1

That's a non-standard format, so you need to make use of Custom Type Formatting:

const wrap = arr => ({
    rawType: true,
    toPostgres: () => arr.map(a => pgp.as.format('($1)', [a])).join()
});

Now you can use such a wrapped array as a simple formatting variable:

const data = [1377776, 1377792, 1377793];
db.any('... INNER JOIN (VALUES $1)', [wrap(data)]);
vitaly-t
  • 24,279
  • 15
  • 116
  • 138