5

I have a query I need to run that returns the most recently updated row for each client.

In SQL Server, I would do the following:

SELECT * 
FROM 
(
    SELECT *, ROW_NUMBER() OVER (PARTITION BY client_id ORDER BY date_updated DESC) AS rn
    FROM client_address
) a
WHERE a.rn = 1

Is there a similar way to do this on Intersystems Cache? I'm not finding any documentation for any type of ranking function.

Alex Bello
  • 199
  • 2
  • 20

4 Answers4

4

I looked at the docs and the doesn't appear to be any of the Window functions that exist in SQL Server, Oracle or Postgres so you're stuck with the ANTI-THETA-SELF-JOIN solution.

SELECT *
FROM 
   client_address a
   LEFT JOIN client_address b
   on a.client_id  = b.client_id 
     and a.date_updated < b.date_updated 
WHERE
   b.client_id is null
Conrad Frix
  • 51,984
  • 12
  • 96
  • 155
3

See the documentation for HAVING. Here's how to use it in this case:

SELECT *
FROM client_address
GROUP BY client_id
HAVING date_updated = MIN(date_updated)
  • 1
    Does `SELECT * FROM t GROUP BY ID` work? What does it do for the fields not specified in the group by? – Conrad Frix Jul 11 '16 at 14:38
  • There is a discussion about the query on InterSystems Developer Community https://community.intersystems.com/post/top-group-sql-query-cach%C3%A9 – Evgeny Shvarov Jul 11 '16 at 23:01
0

You can use %vid variable. For example:

SELECT *, %vid FROM (SELECT * FROM Sample.Person) WHERE %vid BETWEEN 5 AND 10

would return rows 5-10 from Sample.Person table.

Documentation.

Discussion on InterSystems Caché developer community.

rfg
  • 1,331
  • 1
  • 8
  • 24
  • Can you give a little more explanation on how that solves my problem? I ran this and only got five rows, with the dataset including more than one row for the same client, which is not the behavior I was looking for. – Alex Bello Jul 05 '16 at 18:25
  • This is a top 1 per group problem not a paging problem – Conrad Frix Jul 05 '16 at 20:30
0
SELECT *
FROM client_address a
WHERE
a.date_updated = (
  SELECT max(b.date_updated) FROM client_address b
  group by b.Client_id
)
Tyler2P
  • 2,324
  • 26
  • 22
  • 31
CoSpringsGuy
  • 1,615
  • 1
  • 13
  • 16
  • I don't think this will work. There's nothing joining the subquery to the main query aside from the maximum date. Perhaps it would work as an inner join instead if you include the client_id in the subquery and join on both the client_id and max_date. – Alex Bello Jul 06 '16 at 13:35
  • @AlexBello yeah `SELECT * FROM client_address a INNER JOIN (SELECT b.Client_id, max(b.date_updated) max_updated FROM client_address b group by b.Client_id) c ON a.Client_id = c.Client_id AND a.date_updated = c.date_updated` should work as well. – Conrad Frix Jul 06 '16 at 13:52
  • Not sure I understand why it wouldn't work but I will try Conrads solution on the similar queries I have been using and see if I get the same results or perhaps better performance. Thanks! – CoSpringsGuy Jul 06 '16 at 14:07
  • @CoSpringsGuy The OP want's the client_address that was last updated *for each* client. Your query returns the last updated client address which would typically be one record. Consider the [two results here](http://data.stackexchange.com/stackoverflow/query/508575/max-vs-max-join?opt.textResults=true) Note: You need to click "Run query" to see the text output – Conrad Frix Jul 06 '16 at 15:17
  • Thanks for helping me understand that Conrad. Sorry for piggy backing your post Alex – CoSpringsGuy Jul 06 '16 at 18:07