0

Currently I have a query in PostgreSQL that makes use of the ARRAY_AGG function in order to group together all distinct values for each column.

The query works fine however it can take up to 38 seconds to run which is not that efficient.

The query currently is like this:

WITH agged_data AS (
    SELECT birth_date, place_of_birth, first_name
    FROM user_info
)

SELECT (ARRAY_AGG(DISTINCT birth_date)), 
(ARRAY_AGG(DISTINCT place_of_birth)), 
(ARRAY_AGG(DISTINCT first_name)), 
FROM agged_data LIMIT 100

There are around >17,000 rows.

The query gets executed in Python backend (FastAPI) but I tested it out in PgAdmin as well. Originally I was running separate queries for each column (there are more columns in the table and the query I use, I just wrote the above as a MWE). But then thought it would put more strain on the db.

Is there an alternative to ARRAY_AGG for what I want to achieve?

mp252
  • 453
  • 1
  • 6
  • 18
  • Did you mean to put the `LIMIT 100` inside the `agged_data` query? – Bergi Mar 31 '23 at 11:55
  • No I tried that, but then it only got the distinct values of the first 100. – mp252 Mar 31 '23 at 11:56
  • 2
    Without a query plan, nobody knows why the query is slow. Could you please share the DDL for all tables involved, including the indexes, and the result from EXPLAIN(ANALYZE, VERBOSE, BUFFERS) for this query? (all in plain text) – Frank Heikens Mar 31 '23 at 12:09
  • @mp252 Then what *do* you want to limit to 100? `array_agg` always returns a single row anyway. – Bergi Mar 31 '23 at 12:16
  • @Bergi, ah yes this is true. Maybe I cannot do what I want to achieve then. I want to get all distinct values from each column and then limit those to 100, so in the `array_agg` there will be 100 values. – mp252 Mar 31 '23 at 12:27
  • 1
    @mp252 For that, you'd need three different subquery, [e.g. using the `ARRAY` constructor](https://stackoverflow.com/a/12499925/1048572): `SELECT ARRAY(SELECT DISTINCT birth_date FROM user_info LIMIT 100) AS birth_dates, ARRAY(…) AS places_of_birth, …;` – Bergi Mar 31 '23 at 12:32
  • @Bergi This worked! It also got it down to 8 seconds!! I had to `LIMIT 1` – mp252 Mar 31 '23 at 12:47

0 Answers0