0

I have a result of a union of two tables like this:

UserId  AppId   Status
------------------------
A1      5       INACTIVE
A1      5       SUSPENDED
A1      6       REVOKED
A1      6       SUSPENDED
A1      6       ACTIVE
A1      7       REVOKED
A1      8       SUSPENDED
B2      7       INACTIVE
B2      7       INACTIVE
B2      7       ACTIVE
C3      6       INACTIVE

How can I eliminate the rows with duplicate UserId and AppId and return with just one Status, with an order of preference to return: 1. ACTIVE 2. SUSPENDED 3. REVOKED 4. INACTIVE

UserId  AppId   Status
------------------------
A1      5       SUSPENDED
A1      6       ACTIVE
A1      7       REVOKED
A1      8       SUSPENDED
B2      7       ACTIVE
C3      6       INACTIVE
Xstaci
  • 67
  • 1
  • 10

2 Answers2

1

This usual problem is referred to as "selecting the top row by group". The query looks something like this:

WITH comp AS (
    SELECT t.UserId, 
        t.AppId, 
        t.Status, 
        ROW_NUMBER() OVER(PARTITION BY t.UserId, t.AppId 
            ORDER BY CASE 
                WHEN t.Status = 'ACTIVE' THEN 1
                WHEN t.Status = 'SUSPENDED' THEN 2
                ...
            END CASE ASC) AS rank
      FROM ??? AS t)
SELECT *
  FROM comp
 WHERE rank = 1

For more on this topic: Select first row in each GROUP BY group?

Dinu
  • 1,374
  • 8
  • 21
0

Create a table for status precedence with a column for the status, and a number between 1 and 4 like you listed in your preferences. Join that table to your set, and use a MAX aggregate function with a GROUP BY UserId and AppId to find which highest precedence exists for each user and in turn use that as a filter.

You didn’t specify which database engine you use, and you didn’t provide DDL and sample data, so we can’t give you a working solution, just high level guidelines to get you in the right direction.

HTH

SQLRaptor
  • 671
  • 4
  • 14