1

Basically I just want a simple way of finding the most recent date in a table, saving it as a variable, and reusing that variable in the same query.

Right now this is how I'm doing it:

with recent_date as (
  select max(date)
  from mytable
)

select *
from mytable
where date = (select * from recent_date)

(For this simple example, a variable is overkill, but in my real-world use-case I reuse the recent date multiple times in the same query.)

But that feels cumbersome. It would be a lot cleaner to save the recent date to a variable rather than a table and having to select from it.

In pseudo-code, something like this would be nice:

$recent_date = (select max(date) from mytable)

select *
from mytable
where date = $recent_date

Is there something like that in Postgres?

Erwin Brandstetter
  • 605,456
  • 145
  • 1,078
  • 1,228
Alec Mather
  • 742
  • 5
  • 20

2 Answers2

1

Better for the simple case

For the scope of a single query, CTEs are a good tool. In my hands the query would look like this:

WITH recent(date) AS (SELECT max(date) FROM mytable)
SELECT m.*
FROM   recent r
JOIN   mytable m USING (date)

Except that the actual example query would burn down to this in my hands:

SELECT *
FROM   mytable
ORDER  BY date DESC NULLS LAST
FETCH  FIRST 1 ROWS WITH TIES;

NULLS LAST only if there can be NULL values. See:

WITH TIES only if date isn't UNIQUE NOT NULL. See:

In combination with an index on mytable (date) (or more specific), this produces the best possible query plan. Look no further.

No, I need variables!

If you positively need variables scoped for the same command, transaction, session or more, there are various options.

The closest thing to "variables" in SQL in Postgres are "customized options". See:

You can only store text, any other type has to be cast (and cast back on retrieval).
To set and retrieve a value from within a query, use the Configuration Settings Functions set_config() and current_setting():

SELECT set_config('foo.recent', max(date)::text, false) FROM mytable;

SELECT *
FROM   mytable
WHERE  date = current_setting('foo.recent')::date;

Typically, there are more efficient ways.

If you need that "recent date" a lot, consider a simple function as "global variable", usable by all transactions in all sessions (but each new command sees its own current state):

CREATE FUNCTION f_recent_date()
  RETURNS date
  LANGUAGE sql STABLE PARALLEL SAFE AS
'SELECT max(date) FROM mytable';

STABLE is a valid volatility setting as the function returns the same result within the same query. Be sure to actually make it STABLE, so Postgres does not evaluate repeatedly. In Postgres 9.6 or later, also make it PARALLEL SAFE. Then your query becomes:

SELECT * FROM mytable WHERE date = f_recent_date();

More options:

Typically, if I need variables in Postgres, I use a PL/pgSQL code block in a function, a procedure, or a DO statement for ad-hoc use without the need to return rows:

DO
$do$
DECLARE
   _recent_date date := (SELECT max(date) FROM mytable);
BEGIN
   PERFORM * FROM mytable WHERE date = _recent_date;
   -- more queries using _recent_date ...
END
$do$;

PL/pgSQL may be what you should be using to begin with. Further reading:

Erwin Brandstetter
  • 605,456
  • 145
  • 1,078
  • 1,228
  • Ok not to put down the amount of effort in this post but... the only part where you almostttt answer my question is the code block in the very end. I am specifically looking for something with the pseudo-syntax of creating a variable, and then using it. TBH I have no idea what all these PL/pgSQL, function, procedure, DO statement things have to do with my question specifically, but if it is relevant, can you say why? – Alec Mather May 13 '22 at 00:38
  • In other words, could you answer the question of creating a variable in psql with as minimal code possible? – Alec Mather May 13 '22 at 00:39
  • You did not mention psql before (the default PostgreSQL interactive terminal). There you can use variables freely. See: https://www.postgresql.org/docs/current/app-psql.html#APP-PSQL-INTERPOLATION – Erwin Brandstetter May 13 '22 at 00:51
  • Oh wow I always thought psql and Postgresql were 100% interchangeable, I didn't realize psql referred to the terminal implementation. My bad, I mean PostgreSQL – Alec Mather May 13 '22 at 00:54
  • Yes, psql <> Postgres. Seriously, though, I get the feeling you are trying to force a programming style that's fit for procedural languages onto the *declarative* language SQL. There is a reason that SQL has no "variables" (only second-hand workarounds). All the options I mentioned are on topic. You are not the first to ask for "variables". Follow the many links I provided. – Erwin Brandstetter May 13 '22 at 00:57
  • I'm not trying to "force it" so much as I'm just trying to understand if it's a "thing" that you can do. If not, I'm fine accepting that. I'm just still confused by your answer. At the end you talk about variables and how you can create them, but you add additional context (PL/pgSQL, functions, procedures..etc), I'm just asking you to explain why that's necessary (if at all) to create a variable in the context of the question. And if it isn't required, then it probably shouldn't be part of this answer right? Sorry if this feels combative, I'm just trying to narrow in on the question <3 – Alec Mather May 13 '22 at 01:32
  • I speak of PL/pgSQL, because that's what you should use when you need variables. SQL does not have variables. Only second-hand workarounds. I spelled out "customized options" for you and added another link (also shown in other linked answers). Now go in peace, you stubborn variable enthusiast! – Erwin Brandstetter May 13 '22 at 03:11
0

Keep in mind that in SQL you cannot directly declare a variable. Basically a CTE is creating variable (or a set of) and in SQL to use a variable you select it. However, if you want to avoid that structure you can just get the variable directl from a subset directly.

select *
  from mytable
 where date = (select max(date) from mytable);
Belayer
  • 13,578
  • 2
  • 11
  • 22