I was looking a way to select the first item from a GROUP BY
in PostgreSQL, until I find this stackoverflow: Select first row in each GROUP BY group?
There, I see that the WITH
command was used.
I'm trying to understand some more "advanced" commands of SQL, like PARTITION
, WITH
, ROW_NUMBER
etc. Until two or three months ago, I known only the basic commands (SELECT
, INNER JOIN
, LEFT JOIN
, ORDER BY
, GROUP BY
, etc);
I have a little problem (resolved, but I don't known if this is the better way* to do).
*better way = I'm more concerned about a clean SQL code than the performance - this is just for a reports that will be executed once a day, and no more than 5000 records.
I have two tables, in PostgreSQL:
+----------------------------------------------+
| TABLE NAME: point |
+--------+---------------+----------+----------+
| km | globalid | lat | long |
+--------+---------------+----------+----------+
| 36600 | 1553E2AB-B2F8 | -1774.44 | -5423.58 |
| 364000 | 25EB2465-1B8A | -1773.42 | -5422.03 |
| 362000 | 5FFDE611-88DF | -1771.80 | -5420.37 |
+--------+---------------+----------+----------+
+---------------------------------------------------------+
| TABLE NAME: photo |
+--------------+---------------+------------+-------------+
| attachmentid | rel_globalid | date | filename |
+--------------+---------------+------------+-------------+
| 1 | 1553E2AB-B2F8 | 2015-02-24 | photo01.jpg |
| 2 | 1553E2AB-B2F8 | 2015-02-24 | photo02.jpg |
| 405 | 25EB2465-1B8A | 2015-02-12 | photo03.jpg |
| 406 | 25EB2465-1B8A | 2015-02-12 | photo04.jpg |
| 407 | 25EB2465-1B8A | 2015-02-13 | photo06.jpg |
| 3 | 5FFDE611-88DF | 2015-02-12 | photo07.jpg |
+--------------+---------------+------------+-------------+
So, for the problem:
Every point
has one or more photos, but I only need the point
data, and first and the last photo
. If point
has only one photo
, I need only the first photo
. If point
has three photos
, I need only the first and the third photo
.
So, how I resolved:
First, I need the first photo
of every point
, so, I grouped by rel_globalid
, and numbered every photo by group:
WITH photos_numbered AS (
SELECT
rel_globalid,
date,
filename,
ROW_NUMBER()
OVER (
PARTITION BY rel_globalid
ORDER BY date
) AS photo_num
FROM
photo
)
With this code, I can get the 2th, 3th and so on too.
Ok, so, now, I want to get the first photo (still using the WITH
above):
SELECT *
FROM
photos_numbered
WHERE
photo_num = 1
And to get the last photo, I used the following SQL:
SELECT
p1.*
FROM
photos_numbered p1
JOIN (
SELECT
rel_globalid,
max(photo_num) photo_num
FROM
photos_numbered
GROUP BY
rel_globalid
) p2
ON
p1.rel_globalid = p2.rel_globalid AND
p1.photo_num = p2.photo_num
WHERE
p1.photo_num > 1
The WHERE p1.photo_num > 1
is because if point
has only one photo
, this photo
will appear as first photo, and last photo will be NULL
.
OK, now I must "convert" the SELECT
for the first photo
and the last photo
to a WITH
, and do a simple SELECT
with a INNER JOIN
for the first photo
and a LEFT JOIN
for the last photo
:
WITH photos_numbered AS (
SELECT
rel_globalid,
date,
filename,
ROW_NUMBER()
OVER (
PARTITION BY rel_globalid
ORDER BY date
) AS photo_num
FROM
photo
), first_photo AS (
SELECT *
FROM
photos_numbered
WHERE
photo_num = 1
), last_photo AS (
SELECT p1.*
FROM
photos_numbered p1
JOIN (
SELECT
rel_globalid,
max(photo_num) photo_num
FROM
photos_numbered
GROUP BY
rel_globalid
) p2
ON p1.rel_globalid = p2.rel_globalid AND
p1.photo_num = p2.photo_num
WHERE
p1.photo_num > 1
)
SELECT DISTINCT
point.km,
point.globalid,
point.lat,
point."long",
first_photo.date AS fp_date,
first_photo.filename AS fp_filename,
last_photo.date AS lp_date,
last_photo.filename AS lp_filename
FROM
point
INNER JOIN
first_photo
ON
first_photo.rel_globalid = point.globalid
LEFT JOIN
last_photo
ON
last_photo.rel_globalid = point.globalid
ORDER BY
km
I think this SQL is huge for a 'simple thing'!
Is working? Yes, but I want some advices, some documentations that I can read and understand better, some commands that maybe I can use to make a "better" SQL (like I said, about two or three months ago I don't even know the PARTITION
and WITH
commands).
I tried to put a link for SQLFiddle here, but SQLFiddle never worked for me (always return 'oops' message).