Single OR Grouped/Combined Items for Entries in a Database

Question

I have 2 Similar Types of Things that I want to point to in a certain field in a Database. One of them is a combination of 1 or more of the other.

How should I Design my Database in this kind of situation?

In my current example I have (Simple)Food Ingredients and (Combined)Food Dishes and I want Either One or these Things to be entries in a Meals/Eating table.

So a User can Either Eat a simple Food like an Apple OR a complex food like an Apple Pie that consists of 200g of Apples and 100g of Flour and 30g of Sugar etc. at one point in time in a Meal. I'm thinking something like this:

Dishes |DID| |Name| (|Calories|???)

Food Data |DID| |IID| |Amount|

.

I Find this really annoying tho because Every Single Ingredient would have to have Two (Basically Identical)Entries to start with: 1 in the Ingredients Table and 1 in the Dishes Table so it could be paired up in a Meal. Am I missing something Here? Is there a way around this?

Also I don't know if a Dish should have the Calories Listed in the Database. Having the Calories for a Dish in the database is rather Redundant because it could be Calculated when Making a Query(by summing up&calculating its respective ingredients). BUT this seems quite inefficient since it this calculation would have to be done for every single query of a dish(and it would get worse by adding things like Macros/Nutrional Values/Price which I left out for clarity/simplicity here).

Also If I DO have Calories(and other things relating to food in general) for a Dish I could just have 1 single table in this scenario like:

Food |FID| |Name| |Calories| (|Simple[bool]|?)

Food Data |FID| |FID| |Amount|

This would Seem better in general. The Simple field would distinguish between Simple Ingredients or Dish which I think is worth putting in so you don't have to search in Food Data for every item.

BUT If I want to introduce Specific Dish-Only Data then I would to make some Other Table like:

.

So the Question is: What the BEST General Practice in this kind of scenario?

Is it generally better to do extra calculations when querying rather than have redundant data in these kinds of situations? Is there something I'm missing that would make this simpler/better in general?

What is a... "thing"? I find it hard to understand what you would like to achieve. Join several tables and then query them? — Jacek Kotowski, Feb 21 '20 at 11:05
I mean... I wrote out the entire scenario pretty much I don't how else to say it. In this case the THING is Food Item. Which is either 1 simple/basic ingredient or a dish made out of multiple ingredients with various amount values each. I'm asking how to Best Design the Database in a situation like this. — Rotariu.Stefan, Feb 21 '20 at 11:38
Try to summarise what you want in the first sentence. I start reading: "I have 2 Different Types of Things that I want to point to in a certain field in Database." and I give up. It looks like you would like to demonstrate a phatic function of a language, i.e. open a channel of communication without conveying anything specific. I understand it is often difficult and it happens to me in spoken language too. Try to be specific from the beginning. Remove 'weirds' and 'sorrys' — Jacek Kotowski, Feb 21 '20 at 11:45
Well yes. It's a general question. I can come up with plenty of situations like this one. I quickly stated the general scenario and then gave the specific example I currently have to work with. How am I supposed to say this? It can Literally be Anything. Like 1 Person or a Group of People. Or 1 Photo or a Photo Album. Except In this case theres the added element of things combining in different amounts.What the hell am I supposed to say? — Rotariu.Stefan, Feb 21 '20 at 12:11
I changed the title. Maybe that's clearer. Maybe you could read what I actually wrote instead language critique of my 1st line? — Rotariu.Stefan, Feb 21 '20 at 12:19

downernn · Answer 1 · 2020-02-21T23:43:52.777

I'm not sure this can be answered as generally as you would like, because the semantics and the use of the database should be taken into account. Even in the simple/complex food context of your example, either of the approaches you describe (ingredients/dishes/food_data or food/food_data/dish_data) can be right, depending on the specifics.

Let me get this out of the way first: I wouldn't look for a third approach. Any other thing I can think of would be semantically obscure, hell to maintain or a nightmare to query.

So your first concern is the semantics of the database. Your first approach seems more natural; most people will easily see the semantic distinction between ingredients and dishes. It is also the only option if the "ingredient" entity has another reason of existence besides being part of a dish, e.g. for managing orders of raw ingredients. If you choose to go with the second approach you will have to make sure that a) it fits your data and b) you choose your table names very very carefully.

For the second approach to "fit your data" semantically, simple dishes must fully fit the description: "dishes that don't have the extra dish_data". The [Simple] flag is also acceptable as a property of dish, though a real need for it can be a hint that you're off base with this approach. But if ingredients and dishes only partially overlap, i.e. if you have ingredients that cannot be dishes, or if they have different properties in general, then you are definitely off base. If you find yourself in need of enforcing business rules that would prevent a customer from ordering a serving of "flour"; if you raise questions like what to put under "calories" for the "pickles" (would it be the calories per 100gr for pickles-as-an-ingredient, or the calories per serving for pickles-as-a-side-dish?); if you find you have fields like "measuring unit" that are meaningless for dishes, then you're dealing with two separate entities (ingredients and dishes), not one entity (dish) with two subcategories (simple and complex). If you are only going to duplicate a tiny bit of information between the two tables and save yourself a lot of trouble and ambiguity, by all means do that.

Your second concern is how the data will be used. Try to answer questions like: Are you going to be querying calories of dishes millions of times per second? Are the ingredients - and therefore the calories - of dishes going to stay the same for ever? Will your customer or cook ever need to query what a dish is made of?

"Don't duplicate" and "don't store calculatable values" are two rules that are as hard as design rules come. Even such rules though should be, not really bent, just "critically adjusted" some times, if that makes sense.

"I'm not sure this can be answered as generally as you would like," That's kinda what I was afraid the answer would be. You're getting me to ask even More questions now hehe. It seems to me that you'd favor the Split/Separate tables approach more tho. When you say "as hard as design rules come." you mean hard as in Difficult to Implement or hard as in "You Gotta do this! It's important" . I guess you're saying I probably have to compromise somewhere, right? Anyway, thanks! — Rotariu.Stefan, Feb 22 '20 at 22:08
Yes, I think the separate tables approach would be cleaner in most cases I can think of. But, as I said, it depends on the specifics, in special cases the second approach could be better. "Hard" was intended to mean "shouldn't be bent or broken", so as in "You gotta do this". So I meant to say, as important and useful as some rules are, sometimes you just gotta use your judgment and experience. To be honest though, I think the separate tables approach doesn't even really truly break the "don't duplicate" rule. It just (falsely) looks like it does. — downernn, Feb 22 '20 at 23:19
By the way, I recently answered [this question](https://stackoverflow.com/questions/60352682/mysql-joins-choosing-which-table-to-join-from-based-on-source-table-data/60353185#60353185). I think it is related to your original thought of having a field in your database that would point to two different entities/tables. If I understood correctly, you were wondering if you could link your user order with either a simple food item, held in one table, or a complex dish, held in another and this question is about ways to implement this. But I must say, I strongly feel that would be bad design. — downernn, Feb 22 '20 at 23:30
Yes. It's a fairly similar case. As far as queries go I'd prefer yours instead of the COALESCE method there. Like the author in that thread I find it hard to search for specific answers to this/phrase what I want. But What are you Referring to with "that would be bad design" ? You mean something in that other other question you answered or something I wrote? — Rotariu.Stefan, Feb 23 '20 at 13:13
I meant the idea of having a field point to more than one table is bad design. The complexity of the queries proposed in the other thread (as an answer to what should have been a simple inner join) proves so. Also, the fact that you wouldn't be able to apply fundamental database design practices, like foreign key relationships and cascading, is another indication. — downernn, Feb 23 '20 at 14:57
Right, I kinda had that sense as well. That's why I asked "Am I missing something Here? Is there a way around this?" But I don't see how else you COULD Design something like this. It just kinda looks bad no matter what and I've been thinking about this for the last 2 days and really struggling to make a decision how to proceed :( — Rotariu.Stefan, Feb 23 '20 at 15:37

score 0 · Answer 2 · answered Feb 21 '20 at 15:21

This is a question of understanding the context of your data. I imagine meals can be simple (unprocessed) or be complex and consist of other meals. If I were to generate a database for meals and their calorific value I would not separate them.

meal      | calorific value per 100g | glicemic index  
apple     | 12345                    | 34234
apple-pie | 3233                     | 32334

Other table you would join it with could be a meal composition for a specific person.

2020-02-27|Johny Doe | Breakfast |apple  | 300 g
2020-02-27|Johny Doe | Breakfast |sausage| 150 g
2020-02-27|Johny Doe | Breakfast |apple-juice | 500 g

By joinning the two tables you would learn how much Johny Doe ate callories and perhaps what was the glicemic index...

Then... it is not yet an SQL question but a the question of understanding first the process one would like to describe with SQL.

Single OR Grouped/Combined Items for Entries in a Database

2 Answers2