Select only rows with a different cell

Question

I have the following table (simplified):

 CUSTOMER_ID NAME     PHONE      AGE    HEIGHT     
        1    MANOLO   987456321  56     1.80         
        1    MANOLO   NULL       56     1.79         
        2    LUCÍA    NULL       56     1.50         
        2    LUCÍA    987456321  56     1.50

And the result I expect is to select only the cells whose values are different from each other, i.e. I expect the result :

ID_CLIENT NAME    PHONE      AGE     HEIGHT     
1         -       987456321  -       1.80         
1         -       NULL       -       1.79         
2         -       NULL       -       -        
2         -       987456321  -       -

I have tried using a join with the table itself, but the result stays nested and not in different rows. I have a feeling that some kind of join or intersection must be used, as the closest result to what I am being asked for was achieved by using UNION:

SELECT '-' AS NAME , PHONE , 0,
  FROM [test_temp].[dbo].[CUSTOMER3].       
  UNION     
   SELECT '-' AS NAME , 0 AS PHONE, HEIGHT, 0, FROM [TEST_TEMP].
  FROM [test_temp].[dbo].[CUSTOMER3] UNION select '-' as name , 0 as phone, height

But the results are often incomplete or with duplicates:

TELEPHONE NAME (No column name)
- NULL 0.00
- 0 1.45
- 0 1.50
- 0 1.56
- 0 1.70
- 0 1.74
- 0 1.80
- 0 1.90
- 0 1.96
- 987456321 0.00

Greetings

saving age you need to update every year the column, save only the birthyears, the rest is simple math — nbk, Mar 22 '23 at 12:44
Those square bracket are not valid Postgres syntax (nor standard SQL). You may be thinking of certain Microsoft products. — Erwin Brandstetter, Mar 22 '23 at 12:50
Are here *always* two rows per ID? Can any fields be NULL? Be very clear, please. — Erwin Brandstetter, Mar 22 '23 at 12:56

score 1 · Accepted Answer · answered Mar 22 '23 at 12:49

If I understand correctly, you need something like this:

Select
    s1.CUSTOMER_ID as ID_CLIENT,
    case
        when exists (
            Select
                1
            from
                simplified s2
            where
                s2.name = s1.name
                and s2.CUSTOMER_ID <> s1.CUSTOMER_ID
        ) then '-'
        else s1.name
    end as name,
    --more of cases for columns you need
from
    simplified s1

Here column name is checked for duplicates, and if found - replaced with -. For every column with this duplicates rule you'll need to repeat this case (changing name to respective column).

score 1 · Answer 2 · answered Mar 22 '23 at 13:04

Assuming there are always exactly 2 rows with the same ID and columns can be null:

SELECT a.customer_id
     , CASE WHEN a.name  IS DISTINCT FROM b.name  THEN a.name::text  ELSE '-' END AS name
     , CASE WHEN a.phone IS DISTINCT FROM b.phone THEN a.phone::text ELSE '-' END AS phone
      -- etc.
FROM   customer3 a
JOIN   customer3 b USING (customer_id)
WHERE  a.ctid <> b.ctid;

Notably, we need a cast to text (or similar) to allow '-' in the result for other data types.

IS DISTINCT FROM covers null values too.

ctid is a poor man's replacement for a primary key in Postgres. See:

Find rows with all columns duplicated and no unique field in PostgreSQL

Thank you very much, it has been a great help to solve my problem with duplicate rows. — Álvaro Navarro, Mar 22 '23 at 14:46

Select only rows with a different cell

2 Answers2