2

I'm trying to select products matching a customer's preferences (i.e. the colors they like), but I'm having trouble creating an IN statement when the value that's being assessed in the subquery is an array.

I'm only including a few fields in order to simplify the question, but there are a few hundred fields for each customer and about half of them are arrays, and there are apx. the same number of fields for each product.

This query doesn't work if more than one color has been selected:

$result = mysql_query("SELECT * FROM products WHERE color IN 
(SELECT colors_love FROM style WHERE style.user_id=$user_id)")

I also tried exploding the array like this:

while ($row = mysql_fetch_array($result)) {
$colors_love = explode(',', $row['colors_love']);
echo $row['product_name']. ", ". $row['style']. ",". $row['color']. ", ". $row['price'];
}

Since that produced the same results, I'm assuming that's either not the issue, or that it needs be exploded before it is used in the query - which I'm not quite sure how to do.

Chaya Cooper
  • 2,566
  • 2
  • 38
  • 67
  • 2
    This problem is generally solved by normalizing the database structure. If you had tables for product_colors, and user_colors, then you query would be a join of these two tables (restricted to user_id) to get the PK for product followed by a join to products to get the actual products. But, this doesn't answer your question. – Fls'Zen Dec 28 '12 at 23:55
  • I don't think I can take that approach here. In order to simplify the question I only included a few fields, but there are a few hundred fields for each customer, and about half of them are arrays. – Chaya Cooper Dec 29 '12 at 00:08
  • There are similar question already on SO, and the general consensus seems to be running a stored procedure to store the exploded values into a table so the join can be performed. Any way you dice it, it's not going to perform very fast.(http://stackoverflow.com/questions/5096584/how-to-expand-comma-separated-field-into-multiple-rows-in-mysql) Is this something that you do for one user at a time, or is it something that you need to do for all the users at once? – Fls'Zen Dec 29 '12 at 00:15
  • Also, is products.color just one color, or is it a list as well? – Fls'Zen Dec 29 '12 at 00:16
  • 1
    WHen you say "array", are you talking about a comma separated list? – Sebas Dec 29 '12 at 00:19
  • @Chaya is colors_love a single value or comma separated string? – Atheer Mostafa Dec 29 '12 at 00:54
  • @Fls'Zen - Since I have many other fields with the same issue (there are 100+ customer fields which are comma separated lists), would it make sense to run stored procedures to store the exploded values for each of those fields? The product recommendations are being made to many customers at a time, but I was actually thinking of running a stored procedure to match the customer to relevant products in advance (when their account is created/updated and/or when products change) – Chaya Cooper Dec 29 '12 at 01:00
  • 1
    @Fls'Zen products.color is just one color (I was trying the simplest example 1rst ;-) ), but some of the other product fields are comma separated lists – Chaya Cooper Dec 29 '12 at 01:02
  • @Chaya When I've come across this situation in the past, the solution is often to apply some form of data warehousing. Solutions can range from the stored proc processing you mentioned to having a separate database running a data warehouse with ETL jobs keeping the data warehouse up-to-date with fresh data from the transactional database and pushing relevant information, such as per-customer product ranking, back into the transactional database. Having a data warehouse brings benefits, such as the ability to easily analyze the data (buying trends, etc.), but it also requires the most effort. – Fls'Zen Dec 29 '12 at 01:19
  • @Fls'Zen Data warehousing sounds like a much smarter approach (especially since those abilities will definitely be important in the future), and I'm not afraid of taking a labor intensive approach :-) But since I'm just learning a lot of this, I'm wondering if it'll be a lot more complicated for me to figure out how to do? – Chaya Cooper Dec 29 '12 at 01:35

3 Answers3

0

Do it with EXISTS as follows:

SELECT * FROM products WHERE EXISTS
(SELECT 1 FROM style 
WHERE FIND_IN_SET(products.color,style.colors_love)>0 and style.user_id=$user_id)
Atheer Mostafa
  • 735
  • 3
  • 8
  • I didn't know about EXISTS, so thanks for sharing that with me :-) I think the syntax might be a bit off because it doesn't show any results (there are 3 items which match in my test code), so I'll try to learn more about using EXISTS so that I can hopefully use it. – Chaya Cooper Dec 29 '12 at 00:46
  • Try then replace: WHERE FIND_IN_SET(products.color,style.colors_lov)>0 – Atheer Mostafa Dec 29 '12 at 01:00
  • I tried the updated code in your answer, but that's giving me the error message "Incorrect parameter count in the call to native function 'FIND_IN_SET'" :-( – Chaya Cooper Dec 29 '12 at 01:16
  • Chaya, it was a spelling mistake the comma instead of = try it again :) – Atheer Mostafa Dec 29 '12 at 01:36
  • That did it :-D Out of curiosity, would you mind explaining to me why you would use 'SELECT 1' here? Also, how does this method compare to @Sebas's suggestion of using INNER JOIN and FIND_IN_SET? – Chaya Cooper Dec 29 '12 at 01:48
  • for EXISTS statement, it's enough to be any number of return rows whatever their form or columns, so just a single 1 as return would make the EXISTS true. I prefer EXISTS for these cases as INNER JOIN may result in duplicates (in some other cases of one to many relation for example). I also like EXISTS because I feel it more readable to me. Also consider several cases of EXISTS or NOT EXISTS you can't do with JOINs. – Atheer Mostafa Dec 29 '12 at 01:55
  • Just to clarify that I understand what you're saying about 'SELECT 1' - it's true because one or more rows were returned - right? The point about several cases of EXISTS is actually really important because I'll have to evaluate many fields in this query. Would I just add "AND FIND_IN_SET(...)>0" to do that? – Chaya Cooper Dec 29 '12 at 02:05
  • 1
    Yes just add more conditions to the WHERE filter. – Atheer Mostafa Dec 29 '12 at 02:12
0

It's always better performance & practise to use direct joins rather than IN or EXISTS or functions like FIND_IN_SET where possible. The following query achieves your goal without using IN or EXISTS or other functions:

SELECT
    products.*
FROM
    products
INNER JOIN
    (SELECT style.colors_love
    FROM style
    WHERE style.user_id = $user_id) AS fave_colors ON 
        fave_colors.colors_love = products.color
Manachi
  • 1,027
  • 16
  • 30
-1

It sounds like you're trying to show the customer a list of products in colors which the customer has already stated s/he prefers (colors_love).

This is pseudo-code, but this structure may get what you want, without worrying about exploding your array:

select 
    customers.cust_id, customers.fav_color, products.id, products.name
from
    customers
        inner join products
            on customers.fav_color = products.color
ExactaBox
  • 3,235
  • 16
  • 27
  • fav_color is a list of values, it won't work. (example: products.color = 'red' and customers.fav_color = 'red,blue,orange') – Sebas Dec 29 '12 at 01:08
  • Since colors_love is sourced from the DB, I assumed creating the array had been (unnecessarily) done in PHP. OP has no data samples. I made an assumption -- no need for a downvote. – ExactaBox Dec 29 '12 at 01:20
  • this is the only problem of SO: downvoting is often badly interpreted. It is only here to outstand the fact that the answer is not relevant, nothing personal at all. 2 options: delete your own answer (cancels the minus) or adapt the content? – Sebas Dec 29 '12 at 01:23
  • Fair enough -- I appreciate your explanation. – ExactaBox Dec 29 '12 at 01:33
  • I really appreciate this conversation :-) Both in terms of explaining when something won't work (so that I don't spend hours trying to figure it out ;-) ), and also explaining the downvote because SO can often be a bit daunting to noobs like myself – Chaya Cooper Dec 29 '12 at 01:43