1

I need to find the customerIDs of all customers who have ordered a certain quantity of particular products. For example, I want to get all customerIDs who have ordered 2 or more cheese or 5 or more bread. This should give me 1,4 and 3

customerID  product     qty
1           cheese      3
1           bread       2
2           cheese      1
2           pizza       4
3           bread       10
3           tomato      4
4           cheese      2

It would be much easier if my table structure was altered to have a column for cheese, bread, pizza, bread and tomato. Unfortunately this isn't possible.

EDIT: The customerID/product combo is unique. I would also like to be able to search for when all conditions are met. So, where 3 or more cheese AND 2 or more bread.

Chris
  • 4,672
  • 13
  • 52
  • 93
  • How long is the list of products? – PM 77-1 Sep 28 '14 at 21:24
  • @PM77-1 long ... like 200,000 rows – Chris Sep 28 '14 at 21:25
  • How long is the list that your search criteria is based upon? – PM 77-1 Sep 28 '14 at 21:26
  • @PM77-1 The original table contains about 200,000 rows. This is made up of about 50,000 unique customerIDs – Chris Sep 28 '14 at 21:27
  • Sorry, I was not clear. I'm interested whether the maximum number of products (and their quantities) to use in a single query is known. – PM 77-1 Sep 28 '14 at 21:32
  • Ah, OK. There are about 300 products that could be queried (it will never in reality be more than 10 ever queried in one go though) – Chris Sep 28 '14 at 21:33
  • Let us [continue this discussion in chat](http://chat.stackoverflow.com/rooms/62079/discussion-between-pm-77-1-and-christopherib). – PM 77-1 Sep 28 '14 at 21:34
  • try this: http://stackoverflow.com/q/11230759/1726419 – yossico Sep 28 '14 at 21:38
  • See [MySQL pivot table](http://stackoverflow.com/questions/7674786/mysql-pivot-table) for possible solutions. – PM 77-1 Sep 28 '14 at 21:42
  • is the pair `customerID,product` unique to the table? – FuzzyTree Sep 28 '14 at 22:18
  • @FuzzyTree Yes, they are – Chris Sep 28 '14 at 22:19
  • @christopherib I recommend adding that information to the question because if the pair is unique, then Usman's answer is sufficient but if it's not then you need to use `group by ... having` – FuzzyTree Sep 28 '14 at 22:23
  • @FuzzyTree Thanks, will do this now. Usman's solution is perfect .. but I would like to be able to specify a query where any or all of the conditions are met. His works perfectly for any but not for all – Chris Sep 28 '14 at 22:25
  • @christopherib adding the following to Usman's answer will give you what you need for the `and` scenario: `group by CustomerId having count(*) = 2` . also remove `distinct` from the query – FuzzyTree Sep 28 '14 at 22:30
  • @FuzzyTree Thank you for the guidance. If I try this then I get no results. – Chris Sep 28 '14 at 22:36
  • @FuzzyTree I have created a fiddle at http://sqlfiddle.com/#!2/cfe53/2 – Chris Sep 28 '14 at 22:42
  • @christopherib your fiddle does not contain any customers where both conditions are true. if you correct the data you'll get a row back see http://sqlfiddle.com/#!2/5d8ab7/2 – FuzzyTree Sep 28 '14 at 22:50
  • @FuzzyTree Hmm, http://sqlfiddle.com/#!2/5d8ab7/3 should give me customerID 3 but it's not showing anything – Chris Sep 28 '14 at 22:53
  • @christopherib you need to use `or` (instead of `and`) in your `where` – FuzzyTree Sep 28 '14 at 22:55

3 Answers3

2
select distinct customerID from Products
where (product = 'cheese' and qty >= 2) or (product = 'bread' and qty >= 5);
  • Hello and welcome to Stack Overflow. When answering a question, please don't simply post some code. Please explain why this code works so that the questioner (OP) can use this information for other similar problems in future. – worldofjr Sep 28 '14 at 22:07
  • Hi, this works well. Can you suggest how I can run a query that meets both conditions rather than just one? Using "and" instead of "or" doesn't work – Chris Sep 28 '14 at 22:29
  • @christopherib can you explain a bit more that what exactly do you want? I think I have given answer of your original question. –  Sep 28 '14 at 23:15
  • @worldofjr I'll take care of it next time. Thanks. –  Sep 28 '14 at 23:46
2
SELECT t.customerID,
 SUM(CASE WHEN t.product='cheese' THEN quantity ELSE 0 END) as cheese,
 SUM(CASE WHEN t.product='bread' THEN quantity ELSE 0 END) as bread
FROM t
GROUP BY t.customerID
HAVING cheese>=2
OR bread>=5;

FIDDLE

Or move them to the HAVING clause for clarity

FIDDLE

Mihai
  • 26,325
  • 7
  • 66
  • 81
1

EDIT My bad... I misread the specification. The examples below returns customers that had 'cheese' qty>=2 AND 'bread' qty >=5. The specification was for OR rather than AND.

Here's an example of one approach (out of several viable approaches) to returning the specified a resultset that wasn't specified... customers that had BOTH cheese qty>=2 AND bread qty>=5:

SELECT a.customer_id
  FROM ( SELECT p.customer_id 
           FROM mytable p
          WHERE p.product         = 'cheese'
          GROUP BY p.customer_id
         HAVING SUM(p.qty)        >= 2
       ) a
  JOIN ( SELECT q.customer_id 
           FROM mytable q
          WHERE q.product         = 'bread'
          GROUP BY q.customer_id
         HAVING SUM(q.qty)        >= 5
       ) b
    ON b.customer_id = a.customer_id

I've assumed you wanted to check the qty for a given product and given customer as the "total" qty, from all of the matching rows, not just checking for an individual row with qty over the specified value. If you only want to check qty on individual rows, then remove the predicate on the aggregate i.e. HAVING SUM(qty) >= n, and add a predicate in the WHERE clause AND qty >= n.

EDIT To get the result of customers that had EITHER, we could use a similar approach, getting the list of customers that had cheese qty>=1, and combining that with the list of customer that had bread qty>=5, and get a distinct list.

  ( SELECT p.customer_id 
      FROM mytable p
     WHERE p.product         = 'cheese'
     GROUP BY p.customer_id
    HAVING SUM(p.qty)        >= 2
  ) 
  UNION 
  ( SELECT q.customer_id 
      FROM mytable q
     WHERE q.product         = 'bread'
     GROUP BY q.customer_id
    HAVING SUM(q.qty)        >= 5
  )

There are some other approaches. You specifically asked about changing "rows" into "columns". The term we usually use to refer to that type of operation is "pivot" or "pivot table".

Here's an example of that approach:

SELECT t.customer_id
     , SUM(IF(t.product= 'cheese'  ,t.qty,NULL)) AS `qty_cheese`
     , SUM(IF(t.product= 'bread'   ,t.qty,NULL)) AS `qty_bread`
  FROM mytable t
 GROUP BY t.customer_id
HAVING `qty_cheese`  >= 2
    OR `qty_bread`   >= 5

Again, this is checking the "total" qty for a given customer and given product. If you just want to check for the existence of a row that has qty >= n, then replace the SUM() aggregate function with MAX().

For each row returned from mytable, we're evaluating an expression, comparing product to a specific value, and if it matches, we're returning the value from the qty column.

We typically do this in the SELECT list, just because it allows us to debug the query, we can leave off the "HAVING" clause, and get the list of all customers, and see the qty for each product.

It's not necessary that these expressions be in the SELECT list. These expressions could be moved to the HAVING clause.

SELECT t.customer_id
  FROM mytable t
 GROUP BY t.customer_id
HAVING SUM(IF(t.product= 'cheese'  ,t.qty,NULL))  >= 2
    OR SUM(IF(t.product= 'bread'   ,t.qty,NULL))  >= 5

There are a couple of other viable approaches.

spencer7593
  • 106,611
  • 15
  • 112
  • 140