2

I have the following widgets table in Postgres:

enter image description here

(That screenshot is a crude Excel representation of it.) I am trying to write a single SQL query that will return me:

  1. The widgets record representing the widget with today's earliest load time; or
  2. If no widgets were loaded today, the one with the earliest load time ever (for the whole table)

So, using the image above:

  • The query would first try to return the widget which was loaded first, today (if such a widget exists). In this case, only widgets with an id of 3 and 5003094 (respectively) were loaded today. Of these two, widget_id = 3 was loaded earlier than the other, so this is the record that the query would return.
  • However, if we pretend that those two widgets were not in the table, and furthermore, that no widgets were loaded today, then the query would return widget_id = 1, because it was loaded back in 2010.

Here's my initial attempt at the query:

SELECT
    MIN(w.loaded_date_time)
FROM
    widgets w
WHERE
    w.loaded_date_time >= now()
    OR
    1=1

However, I know right off the bat that this is not syntactically correct. Any ideas? Thanks in advance!

IAmYourFaja
  • 55,468
  • 181
  • 466
  • 756

4 Answers4

3

One row total

SELECT *
FROM   widgets
ORDER  BY loaded_date_time < now()::date, loaded_date_time
LIMIT  1;

.. effectively sorting timestamps of today (+ non-existent future) first, because boolean expressions are sorted FALSE -> TRUE -> NULL.

If future dates are possible:

ORDER  BY
       (loaded_date_time::date = now()::date) DESC NULLS LAST
      ,loaded_date_time

NULLS LAST is only relevant if loaded_date_time can be NULL, which should be disallowed to begin with. Drop the clause in this case.

One row per widget

SELECT DISTINCT ON (widget_id)
       widget_id, loaded_date_time
FROM   widgets
ORDER  BY
       widget_id
      ,(loaded_date_time::date = now()::date) DESC
      ,loaded_date_time;

Why and how does this work?

  • About DISTINCT ON:
  • In the ORDER BY clause ...
    ... widget_id goes first, obviously - has to match the DISTINCT clause.
    ... then sort records of "today" first. If there are none, other records move up automatically.
    ... finally, ealier records come sorted first.

    Thereby, the desired row comes first and is picked by DISTINCT. All done in one go.

  • If you rewrite the expression (loaded_date_time::date = now()::date) to

    (loaded_date_time >= now()::date AND
     loaded_date_time < (now()::date + 1))   -- note: < not: <=
    

.. it might be faster with a plain index on loaded_date_time. Because, if you have an expression on both sides of the equal sign, you certainly can't use a plain index at all. You should have an index on widget_id (obviously) and probably another on loaded_date_time:

CREATE INDEX foo_idx ON widgets (loaded_date_time)

A multi-column index may be a bit faster:

CREATE INDEX foo_idx ON widgets (widget_id, loaded_date_time);

Test with EXPLAIN ANLYZE, whether it gets used. It should be, I didn't test. If it doesn't, no point to have it.

Community
  • 1
  • 1
Erwin Brandstetter
  • 605,456
  • 145
  • 1,078
  • 1,228
  • As always beautifully written. But I think you badly misunderstood the question. -1 – Clodoaldo Neto Oct 15 '12 at 19:18
  • @Clodoaldo: Hmmm, I reread the question and my answer, found a typo, but otherwise ... I think my answer fits - in particular the first, simple one. I actually think all other answers (me included at first) have been over-thinking it. – Erwin Brandstetter Oct 15 '12 at 19:34
  • I had read the _One row per widget_ only at first (it was posted first) and it [does not work](http://www.sqlfiddle.com/#!1/25ba4/9) as intended. Yes the simplified version is correct. – Clodoaldo Neto Oct 15 '12 at 19:42
  • @Clodoaldo: Well, it does work as advertised, it returns "one row per widget". Wasn't sure whether the OP really just wants 1 row total, which is really very simple. Hence my comment to the question. So I provided an answer for both variants. Also ran a [comparison based on your sqlfiddle](http://www.sqlfiddle.com/#!1/25ba4/13). Same result, just simpler / faster. But that seems obvious by now. – Erwin Brandstetter Oct 15 '12 at 19:50
  • The OP is not very clear. `one row per widget`: what is a widget, is it identified by widget_id, or by widget_name? The data shown does not have multiple load_dates per widget{id,name}, so maybe every row is unique enough to report them all in the query. – wildplasser Oct 15 '12 at 21:48
1

Break your query up into two queries and union them, using a not exists() to handle the logic:

select * from widget
where <primary criteria>
union
select * from widget
where <secondary criteria>
and not exists (
    select * from widget
    where <primary criteria>
)

Coding it like this is also easy to read and maintain, especially if you add yet another way of selecting categories later on.

Bohemian
  • 412,405
  • 93
  • 575
  • 722
  • I undid the downvote as I do not use to vote in questions where I answer. But you know why. – Clodoaldo Neto Oct 15 '12 at 19:19
  • 1
    @Clodoaldo No, sorry. I have no idea why you downvoted, that's why I asked. I believe my answer to be a good one and I would absolutely code this way if I were writing it for myself. I am interested to know though, because I may learn something from you - I would appreciate knowing your reason. – Bohemian Oct 15 '12 at 19:27
  • You have nothing to learn from me and you know it too. Given the OP's obvious lack of SQL knowledge real code is necessary. Your answer fits between an answer and a comment. – Clodoaldo Neto Oct 15 '12 at 19:48
  • 1
    @Clodoaldo Why is this not an answer? I'm curious... I would be happy to use this code in a production environment. – Bohemian Oct 15 '12 at 19:58
1

This is an application for ranking functions. The key is the order by:

select w.*
from (select w.*,
             row_number() over (partition by widget_id
                                order by isToday, datetime desc) as seqnum
      from (select w.*,
                   (case when cast(widget_loaded_date_time as date) = cast(now() as date)
                         then 0
                         else 1
                    end) as isToday
            from widgets w
           ) w
     ) w
where seqnum = 1
Gordon Linoff
  • 1,242,037
  • 58
  • 646
  • 786
1

SQL Fiddle

select w.id, w.name, w.loaded_date_time
from (
    select min(loaded_date_time) loaded_date_time
    from widgets
    where loaded_date_time::date = current_date
    union
    select min(loaded_date_time) loaded_date_time
    from widgets
    order by loaded_date_time desc
    limit 1
) s inner join widgets w on w.loaded_date_time = s.loaded_date_time
Clodoaldo Neto
  • 118,695
  • 26
  • 233
  • 260