3

is it possible to use GROUP BY with a preference for records?

For instance I have whole bunch of contact data that may or may not contain all info - in a CSV sense if might look like this:

Test User, Address1, Address2, test@test.com  
, , , test@test.com

If I was to GROUP BY email, I would love to extract the more relevant record.

Hope that makes sense?

Yours, Chris

Gustav Bertram
  • 14,591
  • 3
  • 40
  • 65
ChrisS
  • 719
  • 1
  • 8
  • 7
  • What do you mean by "more relevant record" ? Another row with an identical email? Or another column? Or? – Andomar Nov 16 '09 at 12:32
  • Hi, supposed those two rows above where in the database - I'd want to pull out the one with contact information rather than the row with empty fields. – ChrisS Nov 16 '09 at 12:44

4 Answers4

2

You could use an aggregate function for getting the more 'relevant' record for each email.
I think this query would get you the best result:

SELECT emailAddress, max(concat(fullName,',',address1,',',address2))
FROM table
GROUP BY emailAddress

It will return the richest row for each email address but all data will be returned within one string (comma-separated) so you will have to parse it somehow.
If performance is no issue and you'd like to get a normal result set in separate fields then you could go with one:

SELECT table.emailAddress, fullName, address1, address2 
FROM 
table JOIN 
    (SELECT emailAddress, 
       max(concat(fullName,address1,address2)) as bestRowInOneString
    FROM table
    GROUP BY emailAddress
    ) bestRowsSubQuery 
 ON 
   concat(table.fullname,table.address1,table.address2) = bestRowsSubQuery.bestRowInOneString
   AND table.emailAddress = bestRowsSubQuery.emailAddress
grateful.dev
  • 1,437
  • 10
  • 21
1

For each email, this query will select the record with the most fields set:

SELECT  *
FROM    (
        SELECT  DISTINCT email
        FROM    mytable
        ) mi
JOIN    mytable mo
ON      mo.id =
        (
        SELECT  id
        FROM    mytable mf
        WHERE   mf.email = mi.email
        ORDER BY
                username IS NULL + address1 IS NULL + address2 IS NULL DESC
        LIMIT 1
        )
Quassnoi
  • 413,100
  • 91
  • 616
  • 614
0

GROUP BY SUBSTRING(field,N);

Haim Evgi
  • 123,187
  • 45
  • 217
  • 223
0

Not sure what you mean, but in some (most?) SQL dialects you can combine GROUP BY with CASE logic:

GROUP BY case when col1 = 'x' then col1 else col2 end

(like I said, I'm not sure if that is valid for MySql).

EDIT: if that's not valid for MySql, you could always extract this bit of logic

GROUP BY case when col1 = 'x' then col1 else col2 end

into a view and SELECT from that view, GROUPing on the column containing the logic e.g.

select * from
(
   select ....
   , (case ....) as logic_col
   from ...
) x
group by logic_col
davek
  • 22,499
  • 9
  • 75
  • 95