0

I am trying to write a query that will group by different combination of certain columns and give me 'N' records for each combination of the columns. I have been able to categorize it using a single column, but I am unable to get the grouping correct when it involves multiple columns. Below is an example of how the data in the table looks like and the output I am expecting. Here 'N' = 2.

enter image description here

Expected output :

  1. max 2 records for each unique combination of columns
  2. which 2 records are selected is not important.
  3. If there are < N records, then return all rows (i.e the no of records should not exceed 'N', but can be <N

enter image description here

UPDATE : Was able to get the desired output. My question now is - if there is any further efficient way to do it, given I will need to run it on a table containing several columns and the number of combinations will quickly increase.

WITH SUBSET AS (
    SELECT [ServerName],[SKU],[Feature1],[Feature2],[Feature3],ROW_NUMBER() OVER (
          PARTITION BY [SKU], [Feature1], [Feature2],[Feature3] order by [ServerName]) AS [ROW NUMBER]
      FROM [tablename]
)
SELECT ServerName ,SKU ,Feature1 ,Feature2 ,Feature3 FROM [SUBSET]
WHERE [SUBSET].[ROW NUMBER] <= 2
Dale K
  • 25,246
  • 15
  • 42
  • 71
user5566364
  • 173
  • 2
  • 12
  • As per the question guide, please do not post images of code, data, error messages, etc. - copy or type the text into the question. Please reserve the use of images for diagrams or demonstrating rendering bugs, things that are impossible to describe accurately via text. – Dale K Jun 21 '23 at 01:34
  • "but I am unable to get the grouping correct" - please show us your query – Dale K Jun 21 '23 at 01:34
  • a bit unclear,you could do multiple queries with different ```group by``` clauses and then union all putting the nulls in the right place,or ```grouping sets``` in one go....it depends really on your end goal....check these and place some tries here. – terrybozzio Jun 21 '23 at 01:45
  • 1
    For efficiency sake, you probably want an index `(SKU, Feature1, Feature2, Feature3, ServerName)` – Charlieface Jun 21 '23 at 10:11

1 Answers1

1

Use the ROW_NUMBER() function in the SQL query to assign a sequential number to each row within the specified grouping. After that filter the result to only include rows where the row number is less than or equal to 'N'.

      SELECT column1, column2, column3, column4, column5
      FROM (
        SELECT column1, column2, column3, column4, column5,
        ROW_NUMBER() OVER (PARTITION BY column1, column2, column3 
              ORDER BY column4, column5) AS row_num
        FROM custom_table
           ) t
      WHERE row_num <= 2
      ORDER BY column1, column2, column3, column4, column5;

The PARTITION BY clause specifies the columns which are needed to group by, and the ORDER BY clause determines the order in which the rows will be numbered within each group.

DebugCode
  • 50
  • 10