-2

I am trying to make a query that groups me by the ID_Country, Year_, Employee_ID columns and applying a max to the Employee_Name column so that it returns the employee name with more characters but for each execution it returns different results.

table with employee data: enter image description here

I am grouping to remove duplicate records the problem is that the employee name is registered in a different way

The expected result is this, so that it always returns the employee name with the largest number of characters

enter image description here

this is my query:

SELECT DISTINCT [ID_Country] ,[Year_] ,[Employee_ID] ,MAX([Employee_Name]) AS Employee_Name
FROM Employee_Tbl
GROUP BY [ID_Country],[Year_],[Employee_ID]

Can someone tell me the correct way to do the query and it always returns the employee name with the largest number of characters?

Thom A
  • 88,727
  • 11
  • 45
  • 75
  • What is your question here? You imply you have an attempt but it isn't working; what is that attempt, why isn't is working? What are your expected results here? – Thom A May 07 '22 at 16:45
  • I want to group by the columns [ID_Country],[Year_],[Employee_ID] to remove duplicate records and from the Employee_Name column bring the name of the employee with the highest number of characters if there is the same number of characters bring the first record – Laurent Ress May 07 '22 at 16:56
  • the question is how can I make my query always return the name that has more characters – Laurent Ress May 07 '22 at 16:57
  • 1
    But that's what I mean, what is "first"? Ordered by what? A table is an unordered bag of rows, so let's make sure a solution considers "first" the same way you do. (Also, `DISTINCT` _and_ `GROUP BY` really, really, really don't ever belong together.) – Aaron Bertrand May 07 '22 at 17:07
  • Does this answer your question? [Get top 1 row of each group](https://stackoverflow.com/questions/6841605/get-top-1-row-of-each-group) – Thom A May 07 '22 at 17:12

1 Answers1

1

You can use ROW_NUMBER() to assign a row number based on country and year, ranked by the length of the name descending:

SELECT ID_Country, Year_, Employee_ID, Employee_Name
FROM
(
  SELECT ID_Country, Year_, Employee_ID, Employee_Name, 
    rn = ROW_NUMBER() OVER 
    (
      PARTITION BY ID_Country, Year_ 
      ORDER BY LEN(Employee_Name) DESC, Employee_ID
    )
  FROM dbo.Employee_Tbl
) AS Employees WHERE rn = 1;

There is some guessing there about how you'd want to break ties (or maybe you don't care), but there is no concept of "first" unless you state an order explicitly.

Aaron Bertrand
  • 272,866
  • 37
  • 466
  • 490