-1

I have to translate a query provided to me into SQL Server query. It uses max_by function which essentially does this.

MAX_BY takes two arguments and returns the value of the first argument for which the value of the second argument is maximized. If multiple rows maximize the result of the second value, and arbitrary first value is chosen from among them. MAX_BY can be used with both numeric and non-numeric data.

https://docs.data.world/documentation/sql/reference/aggregations/max_by.html

A quick sample problem query that needs to be translated

SELECT 
    max_by(score, dob)
FROM
    table_1 t1
    INNER JOIN table_2 t2 ON t1.id = t2.id

I cannot do this processing in the code and need to do this in the query. Kindly suggest if there is a equivalent function in SQL Server that I can use directly or the simplest work around if a function is not available

Shadow
  • 33,525
  • 10
  • 51
  • 64
Nitin Nanda
  • 805
  • 2
  • 11
  • 27
  • 1) Please only tag the RDBMS you want a solution in. 2) Please describe what research yoi have done and where you got stuck. – Dale K Feb 23 '22 at 22:19
  • Interestingly despite that function being in those docs, I can't find a RDBMS that supports it anyway. – Dale K Feb 23 '22 at 22:22
  • 2
    Wouldn't a standard windowing function work: MAX(score) OVER(ORDER BY dob DESC). Or - maybe LAST_VALUE(score) OVER(ORDER BY dob). If you can provide some sample data and expected results I think someone here can provide a working solution. – Jeff Feb 23 '22 at 22:38
  • 1
    Noting those docs are poor, *and arbitrary first value is chosen from among them* is a grammatical typo. – Stu Feb 23 '22 at 22:46
  • I don't think you understand what max_by does, either @Jeff or @Shadow. It's an aggregate function, and it would return 'b' from the following query because the associated y value for x = 'b' is the max within group `SELECT MAX_BY(x, y) FROM (VALUES ('a', 10), ('b', 50), ('c', 20)) AS tab(x, y)` – MarcinJ Feb 23 '22 at 22:47
  • @MarcinJ - so would MAX_BY return a single row? You can get the same results using TOP 1 and ORDER BY: Select Top (1) tab.x From (Values ('a', 10), ('b', 50), ('c', 20)) As tab(x, y) Order By tab.y desc; – Jeff Feb 25 '22 at 21:38
  • @Jeff yes, it's an aggregate function, and therefore has support for group by, I assumed it's not just the simplest case of max_by returning the one row they were after but instead a more general way to replicate its functionality. – MarcinJ Feb 26 '22 at 00:29

2 Answers2

0

Given the example provided for databricks, I think the following should work:

SELECT x FROM (
   SELECT x
        , ROW_NUMBER() OVER (ORDER BY y DESC) as max_by_rank
    FROM (VALUES ('a', 10), ('b', 50), ('c', 20)) AS tab(x, y)
) sub
WHERE sub.max_by_rank = 1

Adapted for your case it'd be:

SELECT 
    score
FROM
    (SELECT score
          , ROW_NUMBER() OVER (ORDER BY dob DESC) as max_by_rank 
       FROM table_1 t1
      INNER 
       JOIN table_2 t2 ON t1.id = t2.id
    ) sub
WHERE sub.max_by_rank = 1
MarcinJ
  • 3,471
  • 2
  • 14
  • 18
0

It would be useful if you provided sample data and the expected result. It makes it a lot easier to help you.

But from your description, it doesn't seem like it should be any more complicated than something like this:

select top 1
       t1.dob,
       hi_score = max(t2.score)
from table_1 t1
join table_2 t2 ON t2.id = t1.id
group by t1.dob
order by t1.dob desc

And if you need to use it as part of a query, just make it a derived table, something like the following. Do note that I remove the top 1 restriction on the size of the result set and the superfluous order by clause:

select *
from      table_1 t1
left join (
            select dob      = t1.dob,
                   hi_score = max(t2.score)
            from table_1 t1
            join table_2 t2 ON t2.id = t1.id
            group by t1.dob
          ) t2 on t2.dob = t1.dob
Nicholas Carey
  • 71,308
  • 16
  • 93
  • 135