1

Here is the pic --> ERD

Shouldn't the sales_rep_id attribute be in the orders table instead of the accounts table?

Usually, if there is a table for customer and another for seller, I see their FK's meet in order table.

And with this ERD, if a sales_rep has been changed with a new one, and we wanted to see every sales_rep with his orders, will the old sales_rep appear with his old orders, or his old orders will belong to the new rep?

Like this pic

Jonathan Leffler
  • 730,956
  • 141
  • 904
  • 1,278
Osama Negm
  • 21
  • 1
  • small note -- why not embed the pictures here? – Mike M Feb 04 '18 at 02:12
  • 1
    @MikeM: People with a reputation of 1 cannot embed pictures into their questions, I believe. 'Tis a nuisance on occasion — while C programs rarely need pictures, database designs often do (though they can be represented textually). – Jonathan Leffler Feb 04 '18 at 07:44
  • As Mike M says, the problems you identify are real, but the schema is not set up to record historical changes — except perhaps the web events table, and it isn't clear what that actually contains. A 'real' schema would need to track the history more clearly. Often, that means things like the orders table containing copies of the information from other related tables as of the time that the order was placed, so that the historically relevant information can be preserved (so new sales rep John Doe doesn't get a bonus from all the sales made by the recently retired Abraham Smith, etc). – Jonathan Leffler Feb 04 '18 at 07:48
  • @JonathanLeffler - ah, good to know about the pictures – Mike M Feb 04 '18 at 07:50
  • 1
    It also means there'd be date ranges on the sales rep table indicating when the assignment was valid. It also means you tend not to delete the record for a sales rep — you mark it as no longer valid. The subject of temporal databases can get tied up with this; they're designed to keep historically accurate representations of database states. But essentially, a 'real' schema would have to be a lot more complex than the one shown in the question. – Jonathan Leffler Feb 04 '18 at 07:50

1 Answers1

1

Yes, you're raising important issues.

It's hard to really say one ERD is always right or wrong - we need to keep the goals in mind.

With the diagram you shared, you do have the problem you talk about.
By only having one Sales Rep for an Account, if the Sales Rep changes, you lose the historical record of who was the Sales Rep for orders before the change; the new Sales Rep will indeed be connected to every order, past and present.
So indeed you might want to put the Sales Rep in the Orders table.

BUT... how will you connect the Sales Rep to the Account then?
Does that relationship exist? What tables will we use for that relationship? A new table just for associating Account_Id, Current_Sales_Rep_Id?

If we step back and think further, we could even say that the ERD you shared is correct, and that any time the Sales Rep changes we have a new Account.
This would also avoid the problem of "erasing" the past Sales Rep.

It's always a question of knowing what is correct for the entities you're modeling.

But, if you're not sure exactly what the system needs, don't forget the fundamentals of Normalization.
Highly Normalized data designs protect you from data loss such as this Sales Rep issue.
Here are a couple SO pages with lots of info on Normalization, if you need it:

Mike M
  • 1,382
  • 11
  • 23
  • As a side note, you might consider defining Primary Keys on meaningful columns in each table, not the RowNumber/AutoInteger/Identity id column. This forces you to think even more about the real meaning of each entity. You can still use the id column as the join key; define a unique constraint or index on it if that isn't automatic in the target db. – Mike M Feb 04 '18 at 02:37
  • --- I suppose I should have said in my previous comment that you should check first the target db that you can indeed make such joins on a unique integer that is not "primary key"...... I don't really know all databases :-) – Mike M Feb 04 '18 at 02:47
  • Normalization is good but not the solution for conceptual design flaws. If a data modeler isn't sure exactly what the system needs, they need to meet with the customer and/or domain experts and discuss the model, how the system will be used and what the consequences of different design decisions will be. – reaanb Feb 04 '18 at 05:33
  • @reaanb -- Of course :) . But when you can't get to a perfect model with all the people involved, the flexibility of being more abstracted is one way to manage the uncertainty. It's usually easier to denormalize data than to try to further abstract it later. The initial problem here is a good example. If we could perfectly and permanently define that there is always one Sales Rep per Account or that there is always one per Order, we might combine them. If we can't be sure, making all separate entities allows their connection to change more easily without re-defining them. – Mike M Feb 04 '18 at 06:28
  • @reaanb -- I'll throw a "but" in the Answer to help make clear the links about Normalization are not the single best answer. – Mike M Feb 04 '18 at 06:50
  • Normalization is orthogonal to the issue. Normalization is about losslessly decomposing complex facts into simple ones, but if a model doesn't accurately represent the business domain, it's just going to give you a simpler, consistent, anomaly-free but still wrong model. – reaanb Feb 04 '18 at 07:32
  • @reaanb - I get what you're saying. Note I'm not disagreeing with you, at all, about the centrality of defining the model for the specific case; that's also what I tried to make most important in my answer. But I do sincerely believe Normalization does help in the way I describe. Over time, you can change things like an association table connecting two entities much, much easier than trying to separate and redefine combined entities. – Mike M Feb 04 '18 at 07:58
  • @reaanb -- I think maybe the timeline of how the model is made may be where we are not on the same page...... I'm saying that in a real project we reach an endpoint in discussions about the model, for any number of reasons including time, money, or simply the information available. After that point, where we can't improve what we know will happen with the entities, I would suggest a bias towards separating something I can't be sure should be combined. – Mike M Feb 04 '18 at 08:06