19

I have a table "tbl" which looks like this:

prod | cust | qty
p1   | c1   | 5
p1   | c2   | 10
p2   | c1   | 2
p3   | c2   | 8

What I need is a distinct list of product and customer pairs, but with only the first customer if the product is sold to more than one customer. In other words, the results need to look like this:

prod | cust
p1   | c1   
p2   | c1   
p3   | c2   

I've tried this every which way I can think of but I can't quite get the correct result. Clearly neither distinct nor group by will work (on their own) since they will both return the p1, c2 row.

I found this question which is a very close match but I can't figure out how to re-write it to get it to do what I need.

To top it all, this currently needs to work in Access 2007 or later, but at some future point it'll need to work in MySQL as well.

Extra credit to anyone who also joins the results to the customer table so that I can look up the human-readable name from the customer code, e.g. c1 => Fred Bloggs Spanners.

wobblycogs
  • How do you define "first customer"? Is there a date field that makes that explicit that you're not showing? – David Peden Feb 29 '12 at 18:53
  • I define "first" to be whichever gets returned by the query - I'd be happy with either p1,c1 or p1,c2 being returned I just can't cope with both being returned (because the product code is being used in another join and I don't want duplicates). There are no other useful fields in table tbl. – wobblycogs Feb 29 '12 at 19:01
  • OK, I read this comment after posting an answer -- this doesn't make sense -- you want the "first" customer, but you don't actually care who the "first" customer actually is or how they are determined? I'm not writing your requirements, but you really want essentially random results over time?!? – Mike Ryan Feb 29 '12 at 19:25
  • Ok, I should have said arbitrary single customer. I am fully aware of the fact the customer will change over time, I don't like it but I have to live with it. Basically, the other table I'm joining to, using the product key from this query, doesn't contain any customer information. Rows are identified only by product key with a total for all customers that I have to report. The client wants to see a breakout by customer and product but that's not possible - they don't have the data - so instead we've been asked to just report any valid customer against the product. – wobblycogs Feb 29 '12 at 19:48

4 Answers

25

Core Question:

SELECT prod, MIN(cust)
FROM yourTable
GROUP BY prod

For the "Bonus":

SELECT T.prod,
       T.first_cust,
       YC.SomeCustomerAttribute1,
       YC.SomeCustomerAttribute2
FROM (
      SELECT prod, MIN(cust) AS first_cust
      FROM yourTable
      GROUP BY prod
) AS T
INNER JOIN yourCustomers AS YC ON YC.cust = T.first_cust
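
To try the bonus query against the question's sample data, here is a minimal sketch. It assumes the table is called tbl, as in the question, plus a hypothetical customers table with columns cust and cust_name. Only the name "Fred Bloggs Spanners" comes from the question; the second customer name is made up. The multi-row INSERT is MySQL-style, so in Access the rows would need to be inserted one statement at a time.

```sql
-- Sample data from the question; table name tbl as in the question.
CREATE TABLE tbl (
    prod CHAR(2),
    cust CHAR(2),
    qty  INT
);

-- Hypothetical lookup table: the table and column names are assumptions.
CREATE TABLE customers (
    cust      CHAR(2) PRIMARY KEY,
    cust_name VARCHAR(50)
);

INSERT INTO tbl (prod, cust, qty) VALUES
    ('p1', 'c1', 5),
    ('p1', 'c2', 10),
    ('p2', 'c1', 2),
    ('p3', 'c2', 8);

INSERT INTO customers (cust, cust_name) VALUES
    ('c1', 'Fred Bloggs Spanners'),
    ('c2', 'Example Customer Two');  -- made-up name for illustration

-- One row per product: the lowest customer code and its readable name.
SELECT T.prod,
       T.first_cust,
       C.cust_name
FROM (
      SELECT prod, MIN(cust) AS first_cust
      FROM tbl
      GROUP BY prod
) AS T
INNER JOIN customers AS C ON C.cust = T.first_cust
ORDER BY T.prod;
```

With this data the query should return three rows: p1 with c1, p2 with c1, and p3 with c2.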
J Cooper
3

I'm going to have to assume that you have some kind of identifier that indicates who is "first". A date column or an identity column or something.

In my example, I've done it with an order_id identity column.

CREATE TABLE products (
    order_id MEDIUMINT NOT NULL AUTO_INCREMENT,
    prod char(2), 
    cust char(2),
    qty int,
    PRIMARY KEY (order_id)
);

INSERT INTO products (prod, cust, qty) VALUES
   ('p1', 'c1', 5),
   ('p1', 'c2', 10),
   ('p2', 'c1', 2),
   ('p3', 'c2', 8);

And then to get your values run:

select p1.prod, p1.cust, p1.qty
from products p1
where not exists (select * from products p2
              where p1.prod = p2.prod
              and p2.order_id < p1.order_id)

For each row, the subquery checks whether any other row for the same product has a lower order_id, i.e. was ordered earlier. If there is an earlier order, the row is not listed (hence the NOT EXISTS).

This is MySQL syntax, by the way, which you say you're migrating to. (Access experts would have to edit this appropriately.)

Now, if you don't have a column identifying what designates the sequence of when orders were entered, you need one. Any schemes that rely upon implicit row_numbering based upon order of insertion will fall apart eventually, since the "first row" is not guaranteed to remain the same.
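
If such a column exists, the same "earliest row per product" result can also be sketched with a derived table instead of NOT EXISTS. This assumes the products table and order_id column from the example above; it is one alternative formulation, not the only way to write it.

```sql
-- Find the lowest order_id per product, then join back to fetch that row.
-- order_id is the primary key, so joining on it alone identifies the row.
SELECT p.prod, p.cust, p.qty
FROM products AS p
INNER JOIN (
    SELECT prod, MIN(order_id) AS first_order_id
    FROM products
    GROUP BY prod
) AS f ON f.first_order_id = p.order_id
ORDER BY p.prod;
```

Since it relies only on GROUP BY, MIN and INNER JOIN, this form may be easier to port between databases, though that would need checking in Access.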

Mike Ryan
  • As far as I know MS Access does not support `exists` – J Cooper Feb 29 '12 at 19:25
  • You may be correct about access -- I use a lot of DBs, but not access, which I assume to be ANSI, but evidently isn't. Should have put the qualifier higher about mysql (which he says he needs eventually). I'll still emphasize my discussion that to do anything real he needs some kind of sequence/datetime identifier. But since he just wants a random result, then whatever. . . – Mike Ryan Feb 29 '12 at 19:33
  • Silly me, I wanted to go that route, but googled around a bit and it seemed as though it wasn't supported. I see it now. I should google harder before I speak! – J Cooper Feb 29 '12 at 19:35
2

If you only want the first result, add LIMIT 1 to your query, or use SELECT DISTINCT to get unique results.

As for your join, check this out:

SELECT * FROM TableA
INNER JOIN TableB
ON TableA.name = TableB.name

This will get all the rows with a matching name from TableA and TableB.

EDIT: J Cooper is right, MS Access doesn't have an equivalent to LIMIT. The closest is TOP, but I don't think that will help. Sorry, I haven't used Access since college.
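
Note that a bare LIMIT 1 trims the whole result to a single row, so to get one customer per product the limit has to be applied per product. A minimal MySQL-only sketch, assuming the question's table tbl (Access has no LIMIT, so this will not run there):

```sql
-- For each distinct product, pick a single customer code via a scalar subquery.
SELECT DISTINCT t1.prod,
       (SELECT t2.cust
        FROM tbl AS t2
        WHERE t2.prod = t1.prod
        ORDER BY t2.cust
        LIMIT 1) AS cust
FROM tbl AS t1;
```

The ORDER BY inside the subquery is what makes the chosen customer repeatable; without it, any customer for the product could come back.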

JKirchartz
-1

"First" and min() are not the same. If you truly want first, try this:

declare @source table
(
    prod varchar(10),
    cust varchar(10),
    qty int
)

insert into @source (prod, cust, qty) values ('p1', 'c1', 5)
insert into @source (prod, cust, qty) values ('p1', 'c2', 10)
insert into @source (prod, cust, qty) values ('p2', 'c1', 2)
insert into @source (prod, cust, qty) values ('p3', 'c2', 8)

select * from @source

declare @target table
(
    prod varchar(10),
    cust varchar(10),
    qty int
)

insert into @target (prod)
select distinct prod from @source

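-- Copy cust and qty from a matching source row; when a product has
-- several source rows, which one is used is not guaranteed.
update t
set
    cust = s.cust,
    qty = s.qty
from @target t
join @source s on s.prod = t.prod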

select * from @target
David Peden
  • This is SQL Server code, T-SQL. Neither the source database nor the destination are SQL Server databases. This code won't work in either. – eksortso Jan 03 '22 at 17:29