-2

I have a table_A with data extracted from service cloud (salesforce) about customers (name, email, customer_nr, etc), but for the same email I have more than one entry. This happens due to different reasons, but I want to select only the row that follows a certain rule, because I can only have one email per person.

Table_A:

Email CustomerType First_Name Last_Name Owner_FirstName Owner_LastName
testA@email.com CustomerServiceAdd NULL NULL Patricia Portela
testA@email.com RegisteredSubscriber John Smith User SCCIntegration

So basically when I group by the email and having count(*)>1, it needs to follows these rules:

  1. I want that when there is more than one entry for the same email, it selects the row where CONCAT([Owner.FirstName],[Owner.LastName])='UserSCCIntegration'

  2. When for the same email, there's more than one row where CONCAT([Owner.FirstName],[Owner.LastName])='UserSCCIntegration', then should be selected the row where SFCC_Customer_Number__pc (this column was not selected for table_A, but it exists in my datatable) is not NULL.

I tried using a CASE WHEN but didn't manage to produce any result, so I am wondering how should the query be created and if using CASE WHEN clause is the right way to go.

Joel Coehoorn
  • 399,467
  • 113
  • 570
  • 794
Tiago
  • 37
  • 6
  • Please read [this](https://spaghettidba.com/2015/04/24/how-to-post-a-t-sql-question-on-a-public-forum/) for some tips on improving your question. It's a lot easier for us to help you fix your code if we can see it and have a minimal, complete example: DDL, sample data and expected results. – HABO Mar 23 '23 at 12:23
  • @Stu it doesn't because on that case the user only wants to select the top 1 row. I want to select according to a criteria, so that when when I group by the email and having count(*)>1, it should be considered the row where CONCAT([Owner.FirstName],[Owner.LastName])='UserSCCIntegration' – Tiago Mar 23 '23 at 13:27
  • When for the same email, there's more than one row where CONCAT([Owner.FirstName],[Owner.LastName])='UserSCCIntegration', then should be selected the row where SFCC_Customer_Number__pc (this column was not selected for table_A, but it exists in my datatable) is not NULL. – Tiago Mar 23 '23 at 13:28
  • @HABO thank you, but unfortunately I don't have the code. I tried using CASE WHEN but without success. I only know what are my conditions but I'm struggling to implement them. – Tiago Mar 23 '23 at 13:29
  • You misunderstand that post. The answers there show how to select the top 1 per group, you just need to define your group and ordering correctly. In this case as @JoelCoehoorn shows, the grouping is `Email` and the ordering is a complex `CASE` expression – Charlieface Mar 23 '23 at 14:34

1 Answers1

1

Use the row_number() windowing function. First add this to the SELECT list:

row_number() over (PARTITION BY Email 
                   ORDER BY case when CONCAT([Owner.FirstName],[Owner.LastName])='UserSCCIntegration' then 0 else 1 end,
                            case when SFCC_Customer_Number__pc is not null then 0 else 1 end
                  ) rn

Then nest the whole query in another SELECT statement and filter based on the new rn column:

SELECT *
FROM (
  -- Original query goes here

) t
WHERE t.rn = 1

This assumes you've already added the necessary JOINs to make the SFCC_Customer_Number__pc value available.

Joel Coehoorn
  • 399,467
  • 113
  • 570
  • 794