0

I have SQL server query to fetch data from SQL Server to Python data frame as below. I need to keep only unique rows based on two columns date & counter

SELECT 
    CAST([A2_TIRE_A_CO_D300_TIMESTAMP] AS DATE) AS date,
    [A2_TIRE_A_CO_D362_VALUE] AS counter,
    [A2_TIRE_A_CO_D400_VALUE] AS D400,
    [A2_TIRE_A_CO_D402_VALUE] AS D402,
    [A2_TIRE_A_CO_D412_VALUE] AS D412,
    [A2_TIRE_A_CO_D414_VALUE] AS D414,
    [A2_TIRE_A_CO_D416_VALUE] AS D416,
    [A2_TIRE_A_CO_D420_VALUE] AS D420,
    [A2_TIRE_A_CO_D422_VALUE] AS D422,
    [A2_TIRE_A_CO_D432_VALUE] AS D432
FROM 
    [aaaa2_tttt_a].[dbo].[tttt_a] 
WHERE
    [A2_TIRE_A_CO_D300_TIMESTAMP] >= DATEADD(Hour, -1, GETDATE()) 
    AND [A2_TIRE_A_CO_L102_VALUE] = 1 
    AND [A2_TIRE_A_CO_L100_VALUE] = 1
| date       | counter | D400        | D402                         | D412        |
|------------|---------|-------------|------------------------------|-------------|
| 2022-04-23 | 2434    | 12848 14393 | 20808 8257 12339 13360 12601 | 16705 16948 |
| 2022-04-23 | 2434    | 12848 14393 | 20808 8257 12339 13360 12601 | 16705 16948 |
| 2022-04-23 | 2434    | 12848 14393 | 20808 8257 12339 13360 12601 | 16705 16948 |
| 2022-04-23 | 2435    | 12848 14393 | 20808 8257 12339 13360 12601 | 16705 16948 |
| 2022-04-23 | 2435    | 12848 14393 | 20808 8257 12339 13360 12601 | 16705 16948 |
| 2022-04-23 | 2435    | 12848 14393 | 20808 8257 12339 13360 12601 | 16705 16948 |
| 2022-04-23 | 2435    | 12848 14393 | 20808 8257 12339 13360 12601 | 16705 16948 |

Requirement

| date       | counter | D400        | D402                         | D412        |
|------------|---------|-------------|------------------------------|-------------|
| 2022-04-23 | 2434    | 12848 14393 | 20808 8257 12339 13360 12601 | 16705 16948 |
| 2022-04-23 | 2435    | 12848 14393 | 20808 8257 12339 13360 12601 | 16705 16948 |

I tried distinct method and other sources. How can I do it?

I tried to create new column by concatenating date and counter and applying distinct:

DISTINCT(CONCAT(CAST([A2_TIRE_A_CO_D300_TIMESTAMP] AS DATE),[A2_TIRE_A_CO_D362_VALUE]) AS unique_column)

but I think it is not correct way

marc_s
  • 732,580
  • 175
  • 1,330
  • 1,459
user_v27
  • 423
  • 4
  • 12
  • I tried to create new column by concatenating date and counter and applying distinct: distinct(concat(CAST([A2_TIRE_A_CO_D300_TIMESTAMP] AS DATE),[A2_TIRE_A_CO_D362_VALUE]) as unique_column). but i think it is not correct way. unnecessary to create new column. – user_v27 Apr 23 '22 at 05:13
  • edited in question – user_v27 Apr 23 '22 at 05:18
  • I think in that question they ask for top row based on one column. My question is i want rows based on two columns – user_v27 Apr 23 '22 at 05:19
  • Same principle, you just adapt row_number to your needs. – Dale K Apr 23 '22 at 05:26

1 Answers1

3

You can try to use and ORDER BY window function of ROW_NUMBER.

partition by might need to put which column you want to represent unique row.

SELECT TOP 1 WITH 
    CAST([A2_TIRE_A_CO_D300_TIMESTAMP] AS DATE) as date,
    [A2_TIRE_A_CO_D362_VALUE] as counter,
    [A2_TIRE_A_CO_D400_VALUE] as D400,
    [A2_TIRE_A_CO_D402_VALUE] as D402,
    [A2_TIRE_A_CO_D412_VALUE] as D412,
    [A2_TIRE_A_CO_D414_VALUE] as D414,
    [A2_TIRE_A_CO_D416_VALUE] as D416,
    [A2_TIRE_A_CO_D420_VALUE] as D420,
    [A2_TIRE_A_CO_D422_VALUE] as D422,
    [A2_TIRE_A_CO_D432_VALUE] as D432
FROM [aaaa2_tttt_a].[dbo].[tttt_a] 
where 
    [A2_TIRE_A_CO_D300_TIMESTAMP] >= DATEADD(Hour,  -1,  GETDATE()) 
and 
    [A2_TIRE_A_CO_L102_VALUE] =1 
and 
    [A2_TIRE_A_CO_L100_VALUE] =1
order by row_number() over (partition by CAST([A2_TIRE_A_CO_D300_TIMESTAMP] AS DATE),[A2_TIRE_A_CO_D362_VALUE] order by date)

but I think you can use DISTINCT directly from your sample data and expect result.

SELECT DISTINCT CAST([A2_TIRE_A_CO_D300_TIMESTAMP] AS DATE) as date,
    [A2_TIRE_A_CO_D362_VALUE] as counter,
    [A2_TIRE_A_CO_D400_VALUE] as D400,
    [A2_TIRE_A_CO_D402_VALUE] as D402,
    [A2_TIRE_A_CO_D412_VALUE] as D412,
    [A2_TIRE_A_CO_D414_VALUE] as D414,
    [A2_TIRE_A_CO_D416_VALUE] as D416,
    [A2_TIRE_A_CO_D420_VALUE] as D420,
    [A2_TIRE_A_CO_D422_VALUE] as D422,
    [A2_TIRE_A_CO_D432_VALUE] as D432
FROM [aaaa2_tttt_a].[dbo].[tttt_a] 
where 
    [A2_TIRE_A_CO_D300_TIMESTAMP] >= DATEADD(Hour,  -1,  GETDATE()) 
and 
    [A2_TIRE_A_CO_L102_VALUE] =1 
and 
    [A2_TIRE_A_CO_L100_VALUE] =1
D-Shih
  • 44,943
  • 6
  • 31
  • 51