0

I have some problems with query in SQL. I have 2 tables.

people
    +----+--------+------+
    | id |  name  | val2 |
    +----+--------+------+
    |  1 | john   |   12 |
    |  2 | adam   |    5 |
    |  3 | alfred |    3 |
    +----+--------+------+

data
+----+----+----+-----+---------------------+
| id | v1 | v2 | v3  |        date         |
+----+----+----+-----+---------------------+
|  1 |  4 | 15 |  18 | 2020-10-16 11:15:53 |
|  1 |  2 | 12 |  17 | 2020-10-16 11:22:53 |
|  1 |  3 | 13 |  16 | 2020-10-16 11:32:53 |
|  2 |  1 | 16 |  15 | 2020-10-16 13:22:53 |
|  2 |  3 | 13 |  25 | 2020-10-16 13:42:53 |
|  2 |  4 | 12 |  35 | 2020-10-16 14:12:53 |
|  3 |  1 | 21 | 12  | 2020-10-16 14:12:53 |
|  3 |  2 | 28 | 42  | 2020-10-16 15:12:53 |
|  3 |  4 | 30 | 72  | 2020-10-16 16:12:53 |
+----+----+----+-----+---------------------+

I need to get in one table ID, NAME, v1,v2,v3,date for the new date to all object from first table something like this:

RESULT
+----+--------+----+----+-----+---------------------+
| id |  name  | v1 | v2 | v3  |        date         |
+----+--------+----+----+-----+---------------------+
|  1 | john   |  3 | 13 |  16 | 2020-10-16 11:32:53 |
|  2 | adam   |  4 | 12 |  35 | 2020-10-16 14:12:53 |
|  3 | alfred |  4 | 30 | 72  | 2020-10-16 16:12:53 |
+----+--------+----+----+-----+---------------------+

I need the newest record from SECOND TABLE for all people from first table. I try do it by this query:

SELECT people.id,
   people.name,
   data.v1,
   data.v2,
   data.v3,
   max(data.date)
FROM people
JOIN DATA ON people.id = data.id
GROUP BY people.id

I got the newest data but v1, v2, v3 is random from table.

Jester48
  • 33
  • 7

2 Answers2

0

One simple method uses window functions:

SELECT p.id, p.name, d.v1, d.v2, d.v3, d.date)
FROM people p JOIN
     (SELECT d.*,
             ROW_NUMBER() OVER (PARTITION BY d.id ORDER BY d.date DESC) as seqnum
      FROM data d
     ) d
     ON p.id = d.id AND d.seqnum = 1;

Note: It seems strange that the join column in data would be id. I would expect it to be called something like people_id.

Gordon Linoff
  • 1,242,037
  • 58
  • 646
  • 786
0

You want entire rows from data, so aggregation is not an option here. In most databases, your query would fail, because the select and group by clause are not consistent... But MySQL, somehow unfortunaltely, gives you enough rope to developers to to hang themselves with. Your query runs (if sql mode ONLY_FULL_GROUP_BY is disabled), but is actually equivalent to:

SELECT people.id, people.name, ANY_VALUE(data.v1), ANY_VALUE(data.v2), ANY_VALUE(data.v3), MAX(data.date)
FROM people
JOIN data on people.id = data.id
GROUP BY people.id

Now it is plain to see that the database gives you any value of data rows that match the join condition - which may, or may not belong to the row that has the latest date.

Instead of grouping, you actually need to filter. One option uses a subquery:

select p.id, p.name, d.v1, d.v2, d.v3, d.date
from people p
inner join data d on d.id = p.id
where d.date = (select max(d1.date) from data d1 where d1.id = d.id)

The upside of this approach is that it works in all versions of MySQL, including pre-8.0, where window functions are not available.

GMB
  • 216,147
  • 25
  • 84
  • 135