0

I have a MySQL table which is named fruits:

enter image description here

I want to select rows from the table where the columns fruit1, fruit2, and fruit3 only contain values in the set [apple, pear, melon] i.e.

  1. only have apple
  2. only have pear
  3. only have melon
  4. only have both apple and pear
  5. only have both apple and melon
  6. only have both pear and melon
  7. only have apple, pear and melon

For my sample data, the rows will be 1, 5, 8.

I have tried this query:

SELECT * FROM fruits WHERE 'apple' in (fruit1, fruit2, fruit3)

which gave me all the rows that contain apple in one of the fruit columns. However I'm not sure how to filter on multiple columns.

Nick
  • 138,499
  • 22
  • 57
  • 95
Hoo
  • 135
  • 13
  • at the most basic level you'd need one clause (or group of clauses) in your WHERE for each of your 7 conditions. The clauses would likely to be joined by ORs – ADyson Nov 06 '18 at 12:14
  • Thank you for your comment, but what if elements is more than 10, 20, or more? – Hoo Nov 06 '18 at 12:17
  • 1
    This looks like bad design. Ideally you should have a row for every fruit within area. – P.Salmon Nov 06 '18 at 12:23
  • 1
    @Hoo then you've got a lot of writing to do. But as P. Salmon says, your data looks to be de-normalised, which is probably contributing to the difficult of writing a query like this. – ADyson Nov 06 '18 at 12:31
  • What if there's a fourth fruit? – Strawberry Nov 06 '18 at 12:35
  • Are you open to **fixing** the wrongs ? If yes, then instead change the table structure to something like: `id`, `fruit_no`, `value`. You can read up on Entity Attribute Value (EAV) pattern: https://stackoverflow.com/questions/494158/best-beginner-resources-for-understanding-the-eav-database-model – Madhur Bhaiya Nov 06 '18 at 12:39
  • @hoo in the real world you aim to write code which is robust,scalable, repeatable undertandable and a whole load of other ables. What you don't want is a query which 'works' but produces the wrong result - for example you could write some code which produces the correct result for 3 columns but then some other part of the organisation adds another column without your knowledge result query 'works' but is incorrect. But this will never happen I hear you say - wanna bet I say. Avoiding this issue comes back to design and change control and version control. The latter 2 are a necessary evil. – P.Salmon Nov 06 '18 at 12:55
  • @P.Salmon Thank you for your advice! Actually, I'm a beginner of programming and I'm making App by myself. So I can control a sudden situation, but I totally agree that I need to write the code which is robust, scalable, repeatable, undertandable. But I have no experience of 'trial and error' to judge this situation, because it is my first trial on database. Perhaps, I can fully understand all of your worries in later. – Hoo Nov 06 '18 at 13:06

2 Answers2

2

Ideal Solution:

You can normalize it by creating two tables instead. First table would be storing id and area.

Create Table area_master (area_id int auto_increment, 
                          area varchar(32), 
                          description varchar(32)
                          Primary Key(area_id));
Insert into area 
       (area, description) 
Values ('areaA', 'Hi'), 
       ('areaB', 'Hello'); -- add more rows as you need

Second table would be storing area_id (FK from area_master table), attribute, value, attribute_order:

Create Table area_fruits (id int auto_increment, 
                          area_id int, 
                          attribute varchar(32), 
                          value varchar(32), 
                          attribute_order int
                          Primary Key(id));

Insert into area_fruits 
       (area_id, attribute, value, attribute_order) 
Values (1, 'fruit', 'apple', 1),
       (2, 'fruit', 'banana', 1),
       (2, 'fruit', 'apple', 3); -- add more rows as needed

Now, the query to fetch the data would simply be the following, and you will not need to change it every time, as you add new fruits :-)

SELECT am.id, am.area, am.description 
FROM area_master AS am 
JOIN area_fruits AS af ON af.area_id = am.area_id AND 
                          af.attribute = 'fruit' 
GROUP BY am.id, am.area, am.description 
HAVING SUM(af.value IN ('apple', 'pear', 'melon')) = COUNT(*) 

P.S: Lot more refinements is possible, such as adding Foreign Key constraints, etc. But I hope you get the gist.


Original Problem:

  • Using Where we consider only those rows where atleast one the fruit columns have either apple, pear, or melon
  • Now, we do a Group By on the id, and use Having to ignore the rows where fruit columns are not empty, and their value is something else other than
    apple, pear, and melon.

Try the following:

SELECT id 
FROM fruits 
WHERE fruit1 IN ('apple', 'pear', 'melon') OR 
      fruit2 IN ('apple', 'pear', 'melon') OR 
      fruit3 IN ('apple', 'pear', 'melon')
GROUP BY id 
HAVING SUM(fruit1 NOT IN ('apple', 'pear', 'melon') AND fruit1 <> '') = 0 AND 
       SUM(fruit2 NOT IN ('apple', 'pear', 'melon') AND fruit2 <> '') = 0 AND 
       SUM(fruit3 NOT IN ('apple', 'pear', 'melon') AND fruit3 <> '') = 0

P.S This is a bad design. Whenever you have n number of similar columns, you should consider normalizing it.

Madhur Bhaiya
  • 28,155
  • 10
  • 49
  • 57
  • Thank you for your answer! Do you have any suggestion for design my database? Actually, I'm a newbie to DB! – Hoo Nov 06 '18 at 12:39
  • @Hoo You change the table structure to something like: `id`, `fruit_no`, `value`. You can read up on Entity Attribute Value (EAV) pattern: https://stackoverflow.com/questions/494158/best-beginner-resources-for-understanding-the-eav-database-model – Madhur Bhaiya Nov 06 '18 at 12:40
  • Thank you for your sincere answer! I think I can catch all of your worries in later. But in this moment, I slowly catch that I need to separate tables :) As mentioned above, actually, I'm a beginner of programming and I'm making App by myself. It is my first time to convert spreadsheet to database. My approach to database is still on the spreadsheet. I have no experience of 'trial and error' to judge this situation, because it is my first trial on database. Perhaps, I can fully understand all of your worries in later. But, thanks to your answer, I can move further! Thank you again! – Hoo Nov 06 '18 at 13:17
  • @Hoo in my personal opinion, if you get a chance to fix the structure in the early stage, you should do it. Once the code becomes big , it has its own inertia and hard to restructure – Madhur Bhaiya Nov 06 '18 at 14:51
  • You are right! I need to learn data structure theory(normalization, etc) in advance. Thank you! – Hoo Nov 07 '18 at 00:00
  • Will you life my question? I'm banned to ask question. Please help me. Thank you. – Hoo Nov 07 '18 at 03:36
1

I've worked on the assumption that blank fruit values in your table are NULL. In that case this query will give you the desired result:

SELECT *
FROM fruits
WHERE (fruit1 IN ('apple', 'melon', 'pear') OR fruit1 IS NULL)
  AND (fruit2 IN ('apple', 'melon', 'pear') OR fruit2 IS NULL)
  AND (fruit3 IN ('apple', 'melon', 'pear') OR fruit3 IS NULL)

Output

id  fruit1  fruit2  fruit3
1   apple       
5   melon           pear
8   apple           melon

If the blank values are just empty strings, change the query to

SELECT *
FROM fruits
WHERE (fruit1 IN ('apple', 'melon', 'pear') OR fruit1 = '')
  AND (fruit2 IN ('apple', 'melon', 'pear') OR fruit2 = '')
  AND (fruit3 IN ('apple', 'melon', 'pear') OR fruit3 = '')

Demo on dbfiddle

Nick
  • 138,499
  • 22
  • 57
  • 95
  • Thank you very much, that's what I really needed! – Hoo Nov 06 '18 at 12:38
  • @hoo is it? - every time you add another column you are going to have to change your query. – P.Salmon Nov 06 '18 at 12:41
  • @P.Salmon Actually, I'm a beginner of DB. This is my first time to make DB. To be frank, I have no experience of trouble for changing my query on every time I add another column. I just thought that I can add 1 line on this code. And of course, I thought that it will not make any problem to users of this service. Do you have any experience to make me learn? – Hoo Nov 06 '18 at 12:49
  • @Nick will you life my question? I'm banned to ask question. Please help me. Thank you. – Hoo Nov 07 '18 at 03:36
  • @Hoo I've edited your question to make it read better. You will need to do the same to some of your other downvoted questions as you really need the users who downvoted the question to remove their downvote. – Nick Nov 07 '18 at 03:55
  • @Nick Thank you. But I'm not sure they will upvote. As you know, it is not because of code, but because of data structure modeling. Thank you anyway :) – Hoo Nov 07 '18 at 04:03